Guide: what is htmlentities()?

I have seen a lot of conflicting answers about this. Many people love to point out that PHP functions alone will not protect you from XSS.

Main contents

  • Definition and Usage
  • Parameter Values
  • Technical Details
  • More Examples
  • What does htmlspecialchars() return?
  • What's the difference between htmlentities() and htmlspecialchars()?
  • Does htmlspecialchars() prevent XSS?
  • What is the use of htmlentities() in PHP?

What XSS exactly can make it through htmlspecialchars and what can make it through htmlentities?

I understand the difference between the functions, but not the different levels of XSS protection they leave you with. Could anyone explain?

asked Sep 2, 2010 at 1:30

htmlspecialchars() will NOT protect you against UTF-7 XSS exploits, which still plague Internet Explorer, even in IE 9: //securethoughts.com/2009/05/exploiting-ie8-utf-7-xss-vulnerability-using-local-redirection/

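For instance:

Attack vector: <script>alert(1)</script>

This type of XSS can be sanitized with the htmlspecialchars() function, because the attacker needs < and > to create a new HTML tag.

Solution: pass the input through htmlspecialchars() before it is written into the page.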
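A minimal sketch of that solution, assuming the untrusted value arrives as a plain string. The helper name escapeHtml() and the sample payload are illustrative and not part of PHP or of the original answer:

```php
<?php
// Hypothetical helper: escape untrusted text for use inside HTML element content.
function escapeHtml(string $input): string
{
    // ENT_QUOTES encodes both quote styles; the charset is stated explicitly.
    return htmlspecialchars($input, ENT_QUOTES, 'UTF-8');
}

// The attack vector from above is rendered as inert text instead of a tag.
$payload = '<script>alert(1)</script>';
echo '<p>' . escapeHtml($payload) . '</p>', PHP_EOL;
```

Because < and > are turned into entities, the browser shows the payload as text rather than parsing it as a script element.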

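  1. User input placed inside a single-quoted attribute. With the ENT_COMPAT flag, htmlspecialchars() leaves single quotes unencoded, so the input can close the attribute; pass ENT_QUOTES instead.

  2. User input placed inside URL attributes: src, href, formaction, ... Escaping alone does not stop a javascript: URL, so the URL scheme has to be validated as well.

  3. User input placed inside a JavaScript tag without any quotes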

  var inputNumber = <?php echo $_GET['input']; ?>;

Attack vector: 1;alert(1)

In some cases we can simply quote the input and prevent the attack by sanitizing it with htmlspecialchars(), but if the input needs to be an integer, we can prevent XSS with input validation.

Solution:


  var inputNumber = <?php echo intval($_GET['input']); ?>;
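One way to apply this kind of validation, sketched with PHP's filter_var() and FILTER_VALIDATE_INT. The helper name validatedInt() and the fallback value of 0 are assumptions made for illustration:

```php
<?php
// Hypothetical helper: return the input as an int, or a fallback
// when the input is not a plain integer.
function validatedInt(string $input, int $fallback = 0): int
{
    // filter_var() returns false when the string is not a valid integer.
    $value = filter_var($input, FILTER_VALIDATE_INT);
    return $value === false ? $fallback : $value;
}

// The attack vector "1;alert(1)" is rejected and replaced by the fallback.
$inputNumber = validatedInt('1;alert(1)');
echo '<script>var inputNumber = ' . $inputNumber . ';</script>', PHP_EOL;
```

Unlike a plain cast, this rejects mixed input outright instead of keeping its numeric prefix, which may or may not be what an application wants.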

Always quote variables when they are placed inside an HTML attribute, and sanitize them properly.
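A rough sketch of the attribute cases, under the assumption that only http and https links should be allowed. The helper names escapeAttr() and safeUrl(), the "#" fallback, and the sample payloads are hypothetical:

```php
<?php
// Hypothetical helper: escape a value for use inside a quoted HTML attribute.
function escapeAttr(string $value): string
{
    // ENT_QUOTES also encodes single quotes, which ENT_COMPAT leaves alone.
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

// Hypothetical helper: accept only http(s) URLs, otherwise fall back to "#".
function safeUrl(string $url): string
{
    $scheme = parse_url($url, PHP_URL_SCHEME);
    $allowed = is_string($scheme)
        && in_array(strtolower($scheme), ['http', 'https'], true);
    return $allowed ? $url : '#';
}

// Single-quoted attribute: the quote in the payload can no longer close it.
$value = "x' onmouseover='alert(1)";
echo "<input type='text' value='" . escapeAttr($value) . "'/>", PHP_EOL;

// URL attribute: escaping alone would let a javascript: URL through.
$link = 'javascript:alert(1)';
echo '<a href="' . escapeAttr(safeUrl($link)) . '">link</a>', PHP_EOL;
```

This allowlist also rejects relative URLs, so a real application would probably need a more permissive rule.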

Example

Convert the predefined characters "<" (less than) and ">" (greater than) to HTML entities:
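A minimal sketch of such a conversion, assuming a sample string that contains a <b> tag (the variable name $str is illustrative):

```php
<?php
// Sample string containing markup that should be shown literally.
$str = "This is some <b>bold</b> text.";

// Convert < and > (and the other predefined characters) to HTML entities.
echo htmlspecialchars($str), PHP_EOL;
```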

The HTML output of the code above will be (View Source):

This is some &lt;b&gt;bold&lt;/b&gt; text.

The browser output of the code above will be:

This is some <b>bold</b> text.

Definition and Usage

The htmlspecialchars() function converts some predefined characters to HTML entities.

The predefined characters are:

  • & (ampersand) becomes &amp;
  • " (double quote) becomes &quot;
  • ' (single quote) becomes &#039;
  • < (less than) becomes &lt;
  • > (greater than) becomes &gt;

Tip: To convert special HTML entities back to characters, use the htmlspecialchars_decode() function.
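As a small illustration of the round trip, the sketch below encodes a sample string and decodes it again; the string and variable names are made up for the example:

```php
<?php
// Sample text containing all three kinds of predefined characters.
$original = 'Tom & "Jerry" <cat>';

// Encode, then decode with the same quote flag so the round trip is lossless.
$encoded = htmlspecialchars($original, ENT_QUOTES, 'UTF-8');
$decoded = htmlspecialchars_decode($encoded, ENT_QUOTES);

echo $encoded, PHP_EOL; // Tom &amp; &quot;Jerry&quot; &lt;cat&gt;
echo $decoded, PHP_EOL; // Tom & "Jerry" <cat>
```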

Syntax

htmlspecialchars(string, flags, character-set, double_encode)

Parameter Values

Parameter    Description
string Required. Specifies the string to convert
flags Optional. Specifies how to handle quotes, invalid encoding and the used document type.

The available quote styles are:

  • ENT_COMPAT - Default before PHP 8.1. Encodes only double quotes
  • ENT_QUOTES - Encodes double and single quotes (part of the default as of PHP 8.1)
  • ENT_NOQUOTES - Does not encode any quotes

Invalid encoding:

  • ENT_IGNORE - Ignores invalid encoding instead of having the function return an empty string. Should be avoided, as it may have security implications.
  • ENT_SUBSTITUTE - Replaces invalid encoding for a specified character set with a Unicode Replacement Character U+FFFD (UTF-8) or &#xFFFD; instead of returning an empty string.
  • ENT_DISALLOWED - Replaces code points that are invalid in the specified doctype with a Unicode Replacement Character U+FFFD (UTF-8) or &#xFFFD;

Additional flags for specifying the used doctype:

  • ENT_HTML401 - Default. Handle code as HTML 4.01
  • ENT_HTML5 - Handle code as HTML 5
  • ENT_XML1 - Handle code as XML 1
  • ENT_XHTML - Handle code as XHTML
character-set Optional. A string that specifies which character-set to use.

Allowed values are:

  • UTF-8 - Default. ASCII compatible multi-byte 8-bit Unicode
  • ISO-8859-1 - Western European
  • ISO-8859-15 - Western European (adds the Euro sign + French and Finnish letters missing in ISO-8859-1)
  • cp866 - DOS-specific Cyrillic charset
  • cp1251 - Windows-specific Cyrillic charset
  • cp1252 - Windows-specific charset for Western European
  • KOI8-R - Russian
  • BIG5 - Traditional Chinese, mainly used in Taiwan
  • GB2312 - Simplified Chinese, national standard character set
  • BIG5-HKSCS - Big5 with Hong Kong extensions
  • Shift_JIS - Japanese
  • EUC-JP - Japanese
  • MacRoman - Character-set that was used by Mac OS

Note: Unrecognized character-sets will be ignored and replaced by ISO-8859-1 in versions prior to PHP 5.4. As of PHP 5.4, they will be ignored and replaced by UTF-8.

double_encode Optional. A boolean value that specifies whether to encode existing HTML entities or not.
  • TRUE - Default. Will convert everything
  • FALSE - Will not encode existing HTML entities
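The effect of the quote flags and of double_encode can be seen in a short sketch. The sample strings are invented for illustration, and the flags are passed explicitly so the result does not depend on the PHP version's default:

```php
<?php
// A string with both quote styles, to compare the three quote flags.
$text = "\"double\" and 'single'";

echo htmlspecialchars($text, ENT_COMPAT, 'UTF-8'), PHP_EOL;   // &quot;double&quot; and 'single'
echo htmlspecialchars($text, ENT_QUOTES, 'UTF-8'), PHP_EOL;   // &quot;double&quot; and &#039;single&#039;
echo htmlspecialchars($text, ENT_NOQUOTES, 'UTF-8'), PHP_EOL; // "double" and 'single'

// A string that already contains an entity, to show double_encode.
$already = 'Fish &amp; chips';

echo htmlspecialchars($already, ENT_QUOTES, 'UTF-8'), PHP_EOL;        // Fish &amp;amp; chips
echo htmlspecialchars($already, ENT_QUOTES, 'UTF-8', false), PHP_EOL; // Fish &amp; chips
```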

Technical Details

Return Value: Returns the converted string. If the string contains invalid encoding, it will return an empty string, unless either the ENT_IGNORE or ENT_SUBSTITUTE flag is set.

PHP Version: 4+

Changelog:

  • PHP 5.6 - Changed the default value for the character-set parameter to the value of the default charset (in configuration).
  • PHP 5.4 - Changed the default value for the character-set parameter to UTF-8.
  • PHP 5.4 - Added ENT_SUBSTITUTE, ENT_DISALLOWED, ENT_HTML401, ENT_HTML5, ENT_XML1 and ENT_XHTML
  • PHP 5.3 - Added ENT_IGNORE constant.
  • PHP 5.2.3 - Added the double_encode parameter.
  • PHP 4.1 - Added the character-set parameter.
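The return-value behaviour for invalid encoding can be checked with a short sketch. The input below is an invented example containing the byte 0xFF, which never occurs in valid UTF-8, and the flags are passed explicitly because newer PHP versions include ENT_SUBSTITUTE by default:

```php
<?php
// "abc", one byte that is invalid in UTF-8, then "def".
$broken = "abc\xFFdef";

// Without ENT_IGNORE or ENT_SUBSTITUTE the whole result is an empty string.
$strict = htmlspecialchars($broken, ENT_COMPAT, 'UTF-8');

// With ENT_SUBSTITUTE the invalid byte is replaced by U+FFFD instead.
$substituted = htmlspecialchars($broken, ENT_COMPAT | ENT_SUBSTITUTE, 'UTF-8');

var_dump($strict === '');
var_dump(strpos($substituted, "\u{FFFD}") !== false);
```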

More Examples

Example

Convert some predefined characters to HTML entities:
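A minimal sketch of such a conversion, assuming a sample string that contains double quotes (the variable name $str is illustrative):

```php
<?php
// Sample string with double quotes around one word.
$str = 'I love "PHP".';

// ENT_QUOTES converts both double and single quotes.
echo htmlspecialchars($str, ENT_QUOTES), PHP_EOL;
```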

The HTML output of the code above will be (View Source):

I love &quot;PHP&quot;.

The browser output of the code above will be:

I love "PHP".


What does htmlspecialchars() return?

The htmlspecialchars() function returns the converted string.

What's the difference between htmlentities() and htmlspecialchars()?

The only difference between these functions is that htmlspecialchars() converts only the special characters to HTML entities, whereas htmlentities() converts all applicable characters to HTML entities.
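The difference shows up as soon as the input contains a character that has a named entity but is not one of the five special characters. In this sketch the sample text is made up, and the é is written as an escape sequence so the result does not depend on the file's encoding:

```php
<?php
// "café <b>" with the é given as a Unicode escape (UTF-8 bytes at runtime).
$text = "caf\u{E9} <b>";

// Only the special characters < and > are converted.
echo htmlspecialchars($text, ENT_QUOTES, 'UTF-8'), PHP_EOL; // café &lt;b&gt;

// Every character with a named entity is converted, including é.
echo htmlentities($text, ENT_QUOTES, 'UTF-8'), PHP_EOL;     // caf&eacute; &lt;b&gt;
```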

Does htmlspecialchars() prevent XSS?

The htmlspecialchars() function converts special characters to HTML entities, a process also known as HTML escaping. It is one of the most popular ways to prevent XSS and is sufficient for most web apps when the output lands in HTML element content or in a quoted attribute. URL attributes and unquoted JavaScript need additional validation.

What is the use of htmlentities() in PHP?

The htmlentities() function converts characters to HTML entities. Tip: To convert HTML entities back to characters, use the html_entity_decode() function. Tip: Use the get_html_translation_table() function to return the translation table used by htmlentities().
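A short sketch of the two tips, using an invented sample string. The table lookup assumes the HTML_ENTITIES table with UTF-8 encoding:

```php
<?php
// Encode a string with accented characters, then decode it again.
$source = "caf\u{E9} & cr\u{E8}me";
$encoded = htmlentities($source, ENT_QUOTES, 'UTF-8');
$decoded = html_entity_decode($encoded, ENT_QUOTES, 'UTF-8');

echo $encoded, PHP_EOL; // caf&eacute; &amp; cr&egrave;me
var_dump($decoded === $source);

// The translation table maps each character to the entity htmlentities() emits.
$table = get_html_translation_table(HTML_ENTITIES, ENT_QUOTES, 'UTF-8');
echo $table['<'], PHP_EOL; // &lt;
```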
