One of the main goals in the design of Sigma is simplicity. The open source ruleset in the GitHub repository currently contains over 300 rule files and grows continuously. Most of the detections in this repository are expressed with plain strings and wildcards for values and some boolean logic around them. We wanted to build a language that is able to express more than 90% of log signatures and leave the remaining, more complex signatures to other (not necessarily machine-readable) languages. This simplicity allows Sigma users to write their own Sigma parsers and build cool services with them.
Another (perhaps not so well-known) design decision was extensibility. Sigma users are free to add custom attributes to Sigma rules. Do you need versioning information directly in Sigma rules? Does a Sigma rule build the foundation for some more sophisticated detection that you want to refer to from the rule? Just add your custom attributes: the Sigma converter only cares about the known parts of a rule and silently ignores the rest.
On the other hand, we often receive feature requests from the Sigma user community that can't be satisfied by adding attributes. Regular expressions have been requested by many people, are supported by most target query languages and, to be honest, are likely one of the features I personally missed most in Sigma. There are other good ideas and requests, and many of them have in common that values given in Sigma rules should be handled specially. This gave me the idea to introduce a new concept to Sigma that was recently merged into the master branch and enables us to extend Sigma with useful features. Furthermore, Sigma is now extensible at the language level, which opens many new possibilities for people and organizations who use Sigma for advanced detections or completely different purposes.
Until now, Sigma rule detection definitions were quite simple:
    [...]
    detection:
        selection:
            fieldname: value
    [...]
Value modifiers now allow you to specify that values should be handled differently, by appending a pipe character (|) and the name of a value modifier to the field name:
    [...]
    detection:
        selection:
            fieldname|base64: value
    [...]
The Sigma converter will now encode the value with Base64 and output the encoded value in the generated query. Value modifiers can be chained:
    [...]
    detection:
        selection:
            fieldname|base64offset|contains: value
    [...]
The base64offset value modifier generates the static part of all three possible Base64 variants of the value by shifting it forward by an offset of up to two characters. The contains modifier then puts * wildcards around the generated values, so that they are matched at arbitrary positions in a Base64-encoded value.
There are two types of value modifiers:
- Transformation modifiers transform values into different values, like the two Base64 modifiers mentioned above. This type of modifier is also able to change the logical operation between values. Transformation modifiers are generally backend-agnostic, which means that you can use them with any backend.
- Type modifiers change the type of a value. The value itself might also be changed by such a modifier, but the main purpose is to tell the backend that a value should be handled differently, e.g. that it should be treated as a regular expression when the re modifier is used. Type modifiers must be supported by the backend.
Generally, value modifiers work on single values and value lists. A value might also expand into multiple values.
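The chaining of transformation modifiers can be illustrated with a minimal Python sketch. It is not the Sigma converter's actual implementation: the names mod_base64, mod_contains, MODIFIERS and apply_modifiers are hypothetical and only show how a field key could be split at the pipe characters and how each modifier could map a list of values to a new list. Type modifiers such as re are left out.

```python
import base64
from typing import Callable, Dict, List, Tuple, Union


def mod_base64(values: List[str]) -> List[str]:
    # Hypothetical transformation: encode every value with Base64.
    return [base64.b64encode(v.encode()).decode() for v in values]


def mod_contains(values: List[str]) -> List[str]:
    # Hypothetical transformation: surround every value with * wildcards.
    return ["*" + v + "*" for v in values]


# Assumed registry that maps modifier names to functions (illustration only).
MODIFIERS: Dict[str, Callable[[List[str]], List[str]]] = {
    "base64": mod_base64,
    "contains": mod_contains,
}


def apply_modifiers(key: str, value: Union[str, List[str]]) -> Tuple[str, List[str]]:
    # "fieldname|mod1|mod2" -> field name plus the modifier names in order.
    field, *modifier_names = key.split("|")
    # Single values and value lists are handled uniformly as lists.
    values = value if isinstance(value, list) else [value]
    for name in modifier_names:
        values = MODIFIERS[name](values)
    return field, values


print(apply_modifiers("fieldname|base64|contains", "value"))
```

Because every modifier consumes and returns a list, a modifier that expands one value into several (like base64offset) fits into the same chain without special handling.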
Currently Supported Modifiers
- contains: puts * wildcards around the values, such that the value is matched anywhere in the field.
- all: Normally, lists of values are linked with OR in the generated query. This modifier changes this to AND. This is useful if you want to express a command line invocation with different parameters where the order may vary, and it removes the need for some cumbersome workarounds.
- base64: The value is encoded with Base64.
- base64offset: If a value appears somewhere inside a Base64-encoded value, its representation depends on its position in the overall value. There are three variants, for shifts by zero to two bytes. Apart from the characters affected by the first and last bytes, each variant has a static part in the middle that can be recognized.
- re: The value is handled as a regular expression by backends. Currently, this is only supported by the Elasticsearch query string backend (es-qs). Further backends (like Splunk) are planned or have to be implemented by contributors with access to the target systems.
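The effect of the all modifier on a value list can be sketched in a few lines of Python. This is a simplified illustration and not the code of any real backend: link_values is a hypothetical helper, the query syntax is only loosely modeled on a Splunk-style field="value" notation, escaping is ignored, and the command line parameters are made-up examples.

```python
from typing import List


def link_values(field: str, values: List[str], use_all: bool = False) -> str:
    # Values of a list are linked with OR by default; the all modifier
    # switches the logical operation to AND.
    operator = " AND " if use_all else " OR "
    conditions = [f'{field}="{value}"' for value in values]  # no escaping, sketch only
    return "(" + operator.join(conditions) + ")"


# Values as they would look after a contains modifier was applied.
params = ["*-nop*", "*-enc*"]
print(link_values("CommandLine", params))                # linked with OR
print(link_values("CommandLine", params, use_all=True))  # linked with AND
```

With AND, both parameters must be present, but their order on the command line does not matter.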
Wildcards are often not sufficient to detect complex patterns. In these cases, regular expressions can be a solution, and with value modifiers Sigma now supports them. A concrete example is matching the last directory part of an executable image path. Wildcards aren't powerful enough to express this: *\\last\\*.exe matches C:\first\second\last\malicious.exe, which was the intention, but also C:\first\second\last\further\not_malicious.exe, which was not. Regular expressions have sufficient power to express this:
    title: Process Creation in Temp Directory
    logsource:
        category: process_creation
        product: windows
    detection:
        selection:
            Image|re: '^.*\\Temp\\[^\]+.exe$'
        condition: selection
Conversion to an Elasticsearch query string results in the expected query:
    $ tools/sigmac -t es-qs -c sysmon -c winlogbeat susp_temp_execution.yml
    (winlog.event_id:"1" AND winlog.channel:"Microsoft\-Windows\-Sysmon\/Operational" AND winlog.event_data.Image:/^.*\\Temp\\[^\]+.exe$/)
But beware: in most target systems, regular expressions are much more expensive than wildcards or matching of plain values. Don't use them if you can choose less expensive tools.
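The difference between the wildcard and the regular expression can be checked outside of any target system with a short Python sketch. It uses the two example paths from above. The wildcard matching is approximated with Python's fnmatch module and the regular expression is written in Python's re syntax, so both are assumptions for illustration and may behave differently from the matching in a real backend.

```python
import re
from fnmatch import fnmatchcase

intended = r"C:\first\second\last\malicious.exe"
unintended = r"C:\first\second\last\further\not_malicious.exe"

# Wildcard pattern: * matches any characters, including backslashes.
wildcard = r"*\last\*.exe"

# Regular expression: [^\\]+ forbids further backslashes after \last\,
# so the file must be located directly in that directory.
regex = re.compile(r"^.*\\last\\[^\\]+\.exe$")

wildcard_hits = [fnmatchcase(path, wildcard) for path in (intended, unintended)]
regex_hits = [bool(regex.search(path)) for path in (intended, unintended)]

print("wildcard:", wildcard_hits)  # matches both paths
print("regex:   ", regex_hits)     # matches only the intended path
```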
Base64 Offset Encoding
Imagine that an attacker encodes HTTP or HTTPS URLs in Base64-encoded script code that is passed on the command line. The URLs can be located anywhere in the encoded script and might therefore be encoded differently. The possible encodings for http:// can be calculated with CyberChef. The static parts are aHR0cDovL, h0dHA6Ly and odHRwOi8v.
Of course, you can write the encoded values directly into the Sigma rule, but that definitely doesn't improve its readability. With value modifiers you can put the plain values into your Sigma rule and transform them:
    title: Base64-encoded URL in Command Line
    logsource:
        category: process_creation
        product: windows
    detection:
        selection:
            CommandLine|base64offset|contains:
                - 'http://'
                - 'https://'
        condition: selection
Converting the rule shows that both URL protocol specifiers expand into a bunch of Base64 encoded values which are matched anywhere in the command line (by the contains modifier). It even works across all backends, as the transformation value processing step is done before the conversion to the target kicks in:
    $ tools/sigmac -t es-qs -c sysmon -c winlogbeat base64_encoded_urls.yml
    (winlog.event_id:"1" AND winlog.channel:"Microsoft\-Windows\-Sysmon\/Operational" AND winlog.event_data.CommandLine.keyword:(*aHR0cDovL* OR *h0dHA6Ly* OR *odHRwOi8v* OR *aHR0cHM6Ly* OR *h0dHBzOi8v* OR *odHRwczovL*))
    $ tools/sigmac -t splunk -c sysmon -c splunk-windows base64_encoded_urls.yml
    (EventCode="1" source="WinEventLog:Microsoft-Windows-Sysmon/Operational" (CommandLine="*aHR0cDovL*" OR CommandLine="*h0dHA6Ly*" OR CommandLine="*odHRwOi8v*" OR CommandLine="*aHR0cHM6Ly*" OR CommandLine="*h0dHBzOi8v*" OR CommandLine="*odHRwczovL*"))
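The expanded values in the queries above can be reproduced with a small Python sketch of the offset technique. It is an illustration under stated assumptions rather than the converter's code: base64_offset_variants is a hypothetical function name, and a space is used as an arbitrary padding byte to shift the value by zero, one or two bytes.

```python
import base64
from typing import List

# Leading characters that depend on the padding bytes, per offset 0, 1, 2.
START_SKIP = (0, 2, 3)
# Trailing characters to drop, indexed by (total length % 3): the "=" padding
# plus the last character that only partially belongs to the value.
END_TRIM = (0, 3, 2)


def base64_offset_variants(value: str) -> List[str]:
    data = value.encode()
    variants = []
    for offset in range(3):
        # Shift the value by prepending 0 to 2 arbitrary bytes.
        encoded = base64.b64encode(b" " * offset + data).decode()
        trim = END_TRIM[(len(data) + offset) % 3]
        # Keep only the static part that is independent of the surroundings.
        variants.append(encoded[START_SKIP[offset]:len(encoded) - trim])
    return variants


for url in ("http://", "https://"):
    print(url, ["*" + v + "*" for v in base64_offset_variants(url)])
```

The characters cut off at both ends are exactly those whose bits are shared with neighbouring bytes, which is why the remaining middle part is stable regardless of what precedes or follows the URL.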
The foundation of this feature is implemented, some modifiers already exist and one backend supports regular expressions. But there is still lots to do:
- Implementation of regular expressions for the other backends.
- Implementation of further modifiers for obfuscation techniques or whatever you consider useful.
- Rewriting of rules where modifiers can be used to improve them.
A blog post about the development of custom modifiers or backend support for types will be posted on this blog soon.
Tell us about new ideas, or about contributions you are planning, in a new issue.