FILTER AND VALIDATE

20.05.2009

If you run PHP 5.2 or later, you can take advantage of the built-in functions for validation and filtering.
First, some code to show all the filters available on the system:

// show all the available filters
if ( function_exists('filter_list') ) {
    /* filter list found */
    echo '<table border=1>';
    echo '<tr><td>Filter Name</td><td>Filter ID</td></tr>';
    foreach (filter_list() as $id => $filter ) {
        echo '<tr><td>'.$filter.'</td><td>'.filter_id($filter).'</td></tr>'."\n";
    }
    echo '</table>';
} else {
    die('Error: Filters not found.');
}

VALIDATION

Validation checks whether the data meets certain criteria. For example, passing in FILTER_VALIDATE_EMAIL determines whether the data is a valid email address, but does not change the data itself.

FILTER_VALIDATE_BOOLEAN (boolean)

Returns TRUE for “1”, “true”, “on” and “yes”. Returns FALSE otherwise.
If FILTER_NULL_ON_FAILURE is set, FALSE is returned only for “0”, “false”, “off”, “no”, and “”, and NULL is returned for all non-boolean values.

$varArr = array(1, TRUE, 'on', 'yes', 'no', NULL, 0, 0.1);
foreach ( $varArr as $value ) {
    echo "VALUE IS: $value ";
    $validation = filter_var($value, FILTER_VALIDATE_BOOLEAN);
    var_dump($validation); echo "<br>";
}

Result is:

VALUE IS: 1 bool(true)
VALUE IS: 1 bool(true)
VALUE IS: on bool(true)
VALUE IS: yes bool(true)
VALUE IS: no bool(false)
VALUE IS: bool(false)
VALUE IS: 0 bool(false)
VALUE IS: 0.1 bool(false) 
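
The effect of FILTER_NULL_ON_FAILURE is easiest to see side by side. The following is only a minimal sketch; the sample values and the $results variable are illustrative and not part of the original notes:

```php
<?php
// Sample inputs: two recognised booleans, an empty string and a non-boolean word.
$inputs = array('yes', 'off', '', 'maybe');
$results = array();
foreach ($inputs as $value) {
    // The flag can be passed directly as the third argument.
    $results[$value] = filter_var($value, FILTER_VALIDATE_BOOLEAN, FILTER_NULL_ON_FAILURE);
    echo "VALUE IS: '$value' ";
    var_dump($results[$value]);
    echo "<br>";
}
// 'yes' gives TRUE, 'off' and '' give FALSE, 'maybe' gives NULL.
```

With the flag set, a NULL result lets you tell "not a boolean at all" apart from a genuine FALSE.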

FILTER_VALIDATE_EMAIL (validate_email)

If the value is a valid email address, the filter returns it. Otherwise, it returns FALSE.

$varArr = array('john.doe@example.com', '12SASA#%$%AS', 'johndoe@a.net');
foreach ( $varArr as $value ) {
    echo "EMAIL IS: $value ";
    $validation = filter_var($value, FILTER_VALIDATE_EMAIL);
    var_dump($validation); echo "<br>";
}

FILTER_VALIDATE_FLOAT (float)

On success it returns the value converted to float; otherwise it returns FALSE. The string '12.3' validates to the float 12.3.

$varArr = array(123.123, 1234, '12.3', 'asd');
foreach ( $varArr as $value ) {
    echo "FLOAT: $value ";
    $validation = filter_var($value, FILTER_VALIDATE_FLOAT);
    var_dump($validation); echo "<br>";
}

If you also want to accept a thousand separator (,), pass the appropriate flag to the filter.

echo filter_var('12,324.23', FILTER_VALIDATE_FLOAT, array('flags' => FILTER_FLAG_ALLOW_THOUSAND));

If you want a decimal separator other than the point, you can specify it in the options:

// here the decimal separator is ,
echo filter_var('12324,12', FILTER_VALIDATE_FLOAT, array('options' => array('decimal' => ',')) );

FILTER_VALIDATE_INT (int)

Note: Instead of applying filter_var to each value of an array, you can feed the whole array to filter_var_array and then iterate over the result, as shown below:

$varArr = array(1234, '123', '123sdas', '123', 0x123);
$validatedArr = filter_var_array($varArr, FILTER_VALIDATE_INT);
foreach ($validatedArr as $int) {
    var_dump($int); echo '<br/>';
}
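
filter_var_array also accepts an associative array that assigns a different filter to each key. A hedged sketch, where the keys (age, email, ratio, visits) and the range limits are made up for illustration:

```php
<?php
// Hypothetical input, e.g. values collected from a form.
$input = array(
    'age'   => '42',
    'email' => 'john.doe@example.com',
    'ratio' => 'abc',
);

// One rule per key: either a filter constant or an array with filter and options.
$rules = array(
    'age'    => array(
        'filter'  => FILTER_VALIDATE_INT,
        'options' => array('min_range' => 0, 'max_range' => 150),
    ),
    'email'  => FILTER_VALIDATE_EMAIL,
    'ratio'  => FILTER_VALIDATE_FLOAT,
    'visits' => FILTER_VALIDATE_INT, // not present in $input
);

$clean = filter_var_array($input, $rules);
var_dump($clean);
// age is int(42), email is returned unchanged, ratio is FALSE,
// and the missing key visits is added with the value NULL.
```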

For 0x123 the decimal value (291) is returned, because PHP converts the literal to an integer before the filter sees it. To also accept octal or hexadecimal numbers given as strings, use the flags as shown below. Octal literals used outside strings need no flag (e.g. 0666 instead of '0666'):

$octal = '0666';
echo filter_var($octal, FILTER_VALIDATE_INT, array('flags' => FILTER_FLAG_ALLOW_OCTAL)). '<br>'; // with the flag returns 438
echo filter_var($octal, FILTER_VALIDATE_INT). '<br>'; // without the flag return FALSE
 
$hexa = '0x345';
echo filter_var($hexa, FILTER_VALIDATE_INT, array('flags' => FILTER_FLAG_ALLOW_HEX)). '<br>'; // with the flag returns 837
echo filter_var($hexa, FILTER_VALIDATE_INT). '<br>'; // without the flag returns FALSE
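
Flags are bit masks, so several of them are combined with the bitwise OR operator (|), not with the logical && operator. A small sketch with illustrative variable names:

```php
<?php
// Accept both octal ('0666') and hexadecimal ('0x345') notation in strings.
$flags = FILTER_FLAG_ALLOW_OCTAL | FILTER_FLAG_ALLOW_HEX;

$fromOctal = filter_var('0666', FILTER_VALIDATE_INT, array('flags' => $flags));
$fromHex   = filter_var('0x345', FILTER_VALIDATE_INT, array('flags' => $flags));
$plain     = filter_var('123', FILTER_VALIDATE_INT, array('flags' => $flags));

var_dump($fromOctal); // int(438)
var_dump($fromHex);   // int(837)
var_dump($plain);     // int(123)
```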

Optionally, you can validate a range:

$intOpts = array( "options" => array("min_range" => 0, "max_range" => 256));
var_dump(filter_var(350, FILTER_VALIDATE_INT, $intOpts)); // FALSE
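
Because a successful validation can return 0, the result should be compared with !== false rather than tested for truthiness. The sketch below also uses the default option, which supplies a fallback value when validation fails; the fallback of 100 is an arbitrary choice for illustration:

```php
<?php
$zero = filter_var('0', FILTER_VALIDATE_INT);   // int(0): valid, but "falsy"
$bad  = filter_var('abc', FILTER_VALIDATE_INT); // FALSE: not an integer

// A loose check such as if ($zero) would wrongly treat 0 as a failure.
$zeroIsValid = ($zero !== false);
$badIsValid  = ($bad !== false);

// Out-of-range input falls back to the default instead of FALSE.
$rangeOpts = array('options' => array('min_range' => 0, 'max_range' => 256, 'default' => 100));
$withDefault = filter_var('350', FILTER_VALIDATE_INT, $rangeOpts);

var_dump($zeroIsValid, $badIsValid, $withDefault);
```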

FILTER_VALIDATE_IP (validate_ip)

echo filter_var('192.168.2.1', FILTER_VALIDATE_IP, array('flags' => FILTER_FLAG_IPV4)). '<br>'; // returns the IP because it's IPV4
echo filter_var('2001:db8:85a3:0:0:8a2e:370:7334', FILTER_VALIDATE_IP, array('flags' => FILTER_FLAG_IPV6)). '<br>'; // returns the IP because it's IPV6
echo filter_var('192.168.2.1', FILTER_VALIDATE_IP, array('flags' => FILTER_FLAG_NO_PRIV_RANGE)). '<br>'; // FALSE because the IP is in a private range
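
The IP flags can be combined as well. The helper below, isPublicIpv4, is a hypothetical name used only for this sketch; it accepts an address only if it is IPv4 and lies outside the private and reserved ranges:

```php
<?php
// Hypothetical helper: TRUE only for IPv4 addresses outside private/reserved ranges.
function isPublicIpv4($ip)
{
    $flags = FILTER_FLAG_IPV4 | FILTER_FLAG_NO_PRIV_RANGE | FILTER_FLAG_NO_RES_RANGE;
    return filter_var($ip, FILTER_VALIDATE_IP, array('flags' => $flags)) !== false;
}

var_dump(isPublicIpv4('8.8.8.8'));                         // public IPv4 address
var_dump(isPublicIpv4('192.168.2.1'));                     // private range
var_dump(isPublicIpv4('10.0.0.1'));                        // private range
var_dump(isPublicIpv4('2001:db8:85a3:0:0:8a2e:370:7334')); // IPv6, rejected by FILTER_FLAG_IPV4
```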

FILTER_VALIDATE_REGEXP (validate_regexp)

One of the most powerful validation filters; it uses Perl-compatible regular expressions.

$pattern = '/^[A-Z]/'; // validate a string that must begin with an uppercase letter
var_dump(filter_var('abcdef', FILTER_VALIDATE_REGEXP, array("options" => array("regexp" => $pattern))) ); // FALSE
var_dump(filter_var('ABCDEF', FILTER_VALIDATE_REGEXP, array("options" => array("regexp" => $pattern))) ); // 'ABCDEF'
var_dump(filter_var('12ABC', FILTER_VALIDATE_REGEXP, array("options" => array("regexp" => $pattern))) ); // FALSE
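
A slightly more practical pattern, sketched under the assumption that a user name must start with a lowercase letter and be 3 to 16 characters long (letters, digits, underscore). Both the rule and the helper name validUsername are invented for this example:

```php
<?php
// Hypothetical rule: lowercase letter first, then 2-15 letters, digits or underscores.
function validUsername($candidate)
{
    $opts = array('options' => array('regexp' => '/^[a-z][a-z0-9_]{2,15}$/'));
    // Returns the original string on a match, FALSE otherwise.
    return filter_var($candidate, FILTER_VALIDATE_REGEXP, $opts);
}

var_dump(validUsername('john_doe42')); // matches, the string is returned
var_dump(validUsername('4john'));      // starts with a digit
var_dump(validUsername('jo'));         // too short
```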

FILTER_VALIDATE_URL (validate_url)

Possible flags:

  • FILTER_FLAG_SCHEME_REQUIRED - Requires the URL to include a scheme (like http://example)
  • FILTER_FLAG_HOST_REQUIRED - Requires the URL to include a host name (like http://www.example.com)
  • FILTER_FLAG_PATH_REQUIRED - Requires the URL to have a path after the domain name (like http://www.example.com/example1/test2/)
  • FILTER_FLAG_QUERY_REQUIRED - Requires the URL to have a query string (like http://www.example.com/example.php?name=Peter&age=37)

Scheme and host are required by the filter in any case; the first two flags were deprecated in PHP 7.3 and removed in PHP 8.0.

var_dump(filter_var('http://www.google.com', FILTER_VALIDATE_URL) ); // ok, returns the URL
var_dump(filter_var('google.com', FILTER_VALIDATE_URL) ); // FALSE
var_dump(filter_var('http://google', FILTER_VALIDATE_URL, FILTER_FLAG_SCHEME_REQUIRED) ); // ok
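
FILTER_VALIDATE_URL accepts any syntactically valid scheme, so an application that expects web addresses usually has to check the scheme itself. A minimal sketch; the helper name isHttpUrl and the list of allowed schemes are assumptions made for this example:

```php
<?php
// Hypothetical helper: valid URL whose scheme is http or https.
function isHttpUrl($candidate)
{
    if (filter_var($candidate, FILTER_VALIDATE_URL) === false) {
        return false;
    }
    // parse_url() extracts the scheme; compare it case-insensitively.
    $scheme = strtolower((string) parse_url($candidate, PHP_URL_SCHEME));
    return in_array($scheme, array('http', 'https'), true);
}

var_dump(isHttpUrl('http://www.google.com'));      // valid http URL
var_dump(isHttpUrl('google.com'));                 // no scheme
var_dump(isHttpUrl('ftp://example.com/file.txt')); // valid URL, wrong scheme
```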

SANITIZATION

Sanitization cleans the data, so it may alter it by removing undesired characters. For example, passing in FILTER_SANITIZE_EMAIL removes characters that an email address must not contain. It does not, however, validate the data.

FILTER_SANITIZE_EMAIL

Sanitizes an email address: removes all characters except letters, digits and !#$%&'*+-/=?^_`{|}~@.[].
Please note that these filters only look at the structure of email address or URL strings; they don't check whether the associated domains actually exist.

var_dump(filter_var("John(Doe)@exa\\mple.com<script>", FILTER_SANITIZE_EMAIL)); // this will return string(25) "JohnDoe@example.comscript", which has correct syntax
                                                                                // but is still not the intended address
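
Since sanitizing does not validate, the two steps are often chained: sanitize first, then validate the result. A sketch with a hypothetical helper named cleanEmail:

```php
<?php
// Hypothetical helper: strip illegal characters, then check the structure.
function cleanEmail($raw)
{
    $sanitized = filter_var($raw, FILTER_SANITIZE_EMAIL);
    // Returns the sanitized address, or FALSE if it is still not well-formed.
    return filter_var($sanitized, FILTER_VALIDATE_EMAIL);
}

var_dump(cleanEmail('John(Doe)@example.com')); // parentheses removed, then accepted
var_dump(cleanEmail('not an email'));          // becomes "notanemail", then rejected
```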

FILTER_SANITIZE_NUMBER_INT

The filter removes all characters except digits and the plus and minus signs.

var_dump(filter_var("123+23-123.123sasd", FILTER_SANITIZE_NUMBER_INT)); // returns string(13) "123+23-123123"
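
As the example shows, the sanitized string is not necessarily a valid integer. A sketch that validates after sanitizing; the helper name cleanInt is an assumption for illustration:

```php
<?php
// Hypothetical helper: keep digits and signs, then require a real integer.
function cleanInt($raw)
{
    $sanitized = filter_var($raw, FILTER_SANITIZE_NUMBER_INT);
    return filter_var($sanitized, FILTER_VALIDATE_INT);
}

var_dump(cleanInt('  -42 apples'));       // sanitized to "-42", a valid integer
var_dump(cleanInt('123+23-123.123sasd')); // sanitized to "123+23-123123", not an integer
```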

FILTER_SANITIZE_NUMBER_FLOAT

It removes all illegal characters from a float number. By default only digits, + and - are kept; the flags below allow additional characters.

  • FILTER_FLAG_ALLOW_FRACTION - Allow fraction separator (like . )
  • FILTER_FLAG_ALLOW_THOUSAND - Allow thousand separator (like , )
  • FILTER_FLAG_ALLOW_SCIENTIFIC - Allow scientific notation (like e and E)

echo filter_var('123.456', FILTER_SANITIZE_NUMBER_FLOAT,FILTER_FLAG_ALLOW_FRACTION); // it displays 123.456
echo filter_var('123,456', FILTER_SANITIZE_NUMBER_FLOAT,FILTER_FLAG_ALLOW_THOUSAND); // it displays 123,456
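
The flags can be combined with the bitwise OR operator. The following sketch, with made-up input strings, compares the default behaviour with two flag combinations:

```php
<?php
$raw = '1,234.56abc';

// No flags: the separators are removed along with the letters.
$noFlags = filter_var($raw, FILTER_SANITIZE_NUMBER_FLOAT);

// Keep both the fraction and the thousand separator.
$both = filter_var($raw, FILTER_SANITIZE_NUMBER_FLOAT,
    FILTER_FLAG_ALLOW_FRACTION | FILTER_FLAG_ALLOW_THOUSAND);

// Keep the fraction separator and scientific notation.
$sci = filter_var('1.5e3 kg', FILTER_SANITIZE_NUMBER_FLOAT,
    FILTER_FLAG_ALLOW_FRACTION | FILTER_FLAG_ALLOW_SCIENTIFIC);

var_dump($noFlags); // string(6) "123456"
var_dump($both);    // string(8) "1,234.56"
var_dump($sci);     // string(5) "1.5e3"
```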

FILTER_SANITIZE_STRING (string)

This filter removes data that is potentially harmful for your application. It is used to strip tags and remove or encode unwanted characters. Note that FILTER_SANITIZE_STRING is deprecated as of PHP 8.1.

Possible options and flags:

  • FILTER_FLAG_NO_ENCODE_QUOTES - Do not encode quotes
  • FILTER_FLAG_STRIP_LOW - Strip characters with ASCII value below 32
  • FILTER_FLAG_STRIP_HIGH - Strip characters with ASCII value above 127
  • FILTER_FLAG_ENCODE_LOW - Encode characters with ASCII value below 32
  • FILTER_FLAG_ENCODE_HIGH - Encode characters with ASCII value above 127
  • FILTER_FLAG_ENCODE_AMP - Encode the & character to &amp;

echo filter_var("<script>alert('Haha')</script>", FILTER_SANITIZE_STRING); // tags are stripped and quotes encoded: alert(&#39;Haha&#39;)

FILTER_SANITIZE_SPECIAL_CHARS (special_chars)

This filter HTML-escapes special characters: it encodes '"<>& and characters with ASCII value below 32.

  • FILTER_FLAG_STRIP_LOW - Strip characters with ASCII value below 32
  • FILTER_FLAG_STRIP_HIGH - Strip characters with ASCII value above 127
  • FILTER_FLAG_ENCODE_HIGH - Encode characters with ASCII value above 127

$string = "<> &&&& ";
echo filter_var($string,FILTER_SANITIZE_SPECIAL_CHARS); // check the page source for the real thing: &#60;&#62; &#38;&#38;&#38;&#38;
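
Quotes and control characters are encoded as numeric entities too, unless FILTER_FLAG_STRIP_LOW removes the control characters first. A short sketch with made-up sample strings:

```php
<?php
// Quotes and angle brackets become numeric entities.
$escaped = filter_var("Tom's <b>", FILTER_SANITIZE_SPECIAL_CHARS);

// A tab (ASCII 9) is encoded by default ...
$encodedTab = filter_var("tab\there", FILTER_SANITIZE_SPECIAL_CHARS);

// ... and removed entirely with FILTER_FLAG_STRIP_LOW.
$strippedTab = filter_var("tab\there", FILTER_SANITIZE_SPECIAL_CHARS, FILTER_FLAG_STRIP_LOW);

var_dump($escaped);     // "Tom&#39;s &#60;b&#62;"
var_dump($encodedTab);  // "tab&#9;here"
var_dump($strippedTab); // "tabhere"
```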

FILTER_SANITIZE_MAGIC_QUOTES (magic_quotes)

The FILTER_SANITIZE_MAGIC_QUOTES filter simply applies the addslashes() function to a string, i.e. it puts backslashes in front of predefined characters. The filter was deprecated in PHP 7.4 and removed in PHP 8.0; FILTER_SANITIZE_ADD_SLASHES replaces it.

The predefined characters are:

  • single quote (')
  • double quote (")
  • backslash (\)
  • NUL (the NUL byte)

echo filter_var(" ' ' \ ", FILTER_SANITIZE_MAGIC_QUOTES); // it displays \' \' \\ 
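
Because the filter is described as a wrapper around addslashes(), the same escaping can be reproduced with that function directly. A sketch with an arbitrary sample string:

```php
<?php
// Sample input containing a single quote, double quotes and a backslash.
$raw = "O'Reilly said \"hi\" \\ bye";

// addslashes() puts a backslash before ', ", \ and the NUL byte.
$escaped = addslashes($raw);

echo $escaped, "\n";

// stripslashes() reverses the escaping.
$restored = stripslashes($escaped);
```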

FILTER_SANITIZE_URL (url)

The FILTER_SANITIZE_URL filter removes all illegal URL characters from a string. This filter allows all letters, digits and $-_.+!*'(),{}|\\^~[]`"><#%;/?:@&=

$var = "http://www.google€.coøm";
var_dump(filter_var($var, FILTER_SANITIZE_URL)); // result: "http://www.google.com" 
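
As with email addresses, a sanitized URL is not automatically a valid one, so validation can follow as a second step. A sketch with a hypothetical helper named cleanUrl:

```php
<?php
// Hypothetical helper: remove illegal characters, then validate the result.
function cleanUrl($raw)
{
    $sanitized = filter_var($raw, FILTER_SANITIZE_URL);
    // Returns the sanitized URL, or FALSE if it is still not a valid URL.
    return filter_var($sanitized, FILTER_VALIDATE_URL);
}

var_dump(cleanUrl('http://www.exam ple.com/')); // the space is removed, then accepted
var_dump(cleanUrl('not a url'));                // becomes "notaurl", then rejected
```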

FILTER_SANITIZE_EMAIL (email)

The FILTER_SANITIZE_EMAIL filter, already shown above, removes all illegal e-mail characters from a string. Dots are legal characters, so consecutive dots survive sanitization.

$var = "john..do(e)@bahoo\\.com";
var_dump(filter_var($var, FILTER_SANITIZE_EMAIL)); // it displays string(19) "john..doe@bahoo.com"

FILTER_UNSAFE_RAW (unsafe_raw)

Does nothing if no flags are specified; optionally strips or encodes special characters, so it only makes sense to use this filter with flags ;) It accepts most of the FILTER_SANITIZE_STRING flags but, unlike that filter, does not strip tags.
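
A brief sketch of the difference the flags make; the sample strings are made up for illustration:

```php
<?php
// Without flags the input comes back unchanged, tags included.
$untouched = filter_var('<b>Tom & Jerry</b>', FILTER_UNSAFE_RAW);

// FILTER_FLAG_ENCODE_AMP encodes the ampersand as a numeric entity.
$amp = filter_var('Tom & Jerry', FILTER_UNSAFE_RAW, FILTER_FLAG_ENCODE_AMP);

// Strip control characters (below 32) and high bytes (above 127).
$stripped = filter_var("a\x01b\xC3\xA9c", FILTER_UNSAFE_RAW,
    FILTER_FLAG_STRIP_LOW | FILTER_FLAG_STRIP_HIGH);

var_dump($untouched); // "<b>Tom & Jerry</b>"
var_dump($amp);       // "Tom &#38; Jerry"
var_dump($stripped);  // "abc"
```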

php/valsan.txt · Last modified: 2013/03/16 17:40 (external edit)