What are common pitfalls or gotchas with untrusted input and regex DoS?

keywords: regex, DoS, untrusted input, validation, performance
description: This article discusses common pitfalls of running regular expressions (regex) against untrusted input, focusing on Denial of Service (DoS) caused by catastrophic backtracking, and how to mitigate them.

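    // Example of a vulnerable regex pattern (do not run it on untrusted input)
    $raw = $_GET['user_input'] ?? '';        // the parameter may be missing
    $input = is_string($raw) ? $raw : '';    // ?user_input[]=x arrives as an array
    $pattern = '/^(a+)+$/';                  // nested quantifiers: (a+)+

    $result = preg_match($pattern, $input);
    if ($result === 1) {
        echo "Valid input!";
    } elseif ($result === 0) {
        echo "Invalid input!";
    } else {
        // false: the engine gave up or errored; preg_last_error() has the reason
        echo "Could not validate input.";
    }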

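    // Explanation:
    // The nested quantifier in (a+)+ lets the engine split a run of 'a's between
    // the inner and outer repetition in exponentially many ways. On untrusted
    // input such as 'aaaaaaaaaaaaaaaaaaaaaab', the trailing 'b' makes the match
    // fail only after those splits have been tried (catastrophic backtracking),
    // so a short string can consume a lot of CPU and lead to a DoS scenario.
    // PHP's pcre.backtrack_limit setting can cut such a match short, in which
    // case preg_match() returns false rather than 0.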
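
One possible mitigation is sketched below. The helper name `isValidRun` and the 256-byte length cap are assumptions made for illustration and are not part of the original example. The sketch drops the nested quantifier (`^a+$` accepts exactly the same strings as `^(a+)+$`), rejects oversized input before the regex engine sees it, and reports engine errors separately from a plain non-match.

```php
<?php
// Hypothetical helper (not from the original snippet).
// Returns true on a match, false on a non-match, null if the engine failed.
function isValidRun(string $input, int $maxLength = 256): ?bool
{
    // Cap the input size first; 256 is an arbitrary illustrative limit.
    if (strlen($input) > $maxLength) {
        return false;
    }

    // No nested quantifier: each 'a' can be consumed in only one way,
    // so a failing match is detected without exponential backtracking.
    $result = preg_match('/^a+$/', $input);

    // preg_match() returns false on errors (e.g. a hit backtrack limit).
    if ($result === false) {
        return null;
    }

    return $result === 1;
}

// Usage, mirroring the example above.
$raw = $_GET['user_input'] ?? '';
$check = is_string($raw) ? isValidRun($raw) : false;

if ($check === true) {
    echo "Valid input!";
} elseif ($check === false) {
    echo "Invalid input!";
} else {
    echo "Could not validate input.";
}
```

When the grouping has to stay, a possessive quantifier such as `(a++)+` is another option PCRE supports, since it stops the engine from re-splitting the run of 'a's. Either way, checking for a `false` return keeps an engine failure from being silently treated as a normal result.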
    
