(PHP 4 >= 4.0.5, PHP 5, PHP 7, PHP 8)
iconv — Convert a string from one character encoding to another
Description
iconv(string $from_encoding, string $to_encoding, string $string): string|false
Parameters
from_encoding
The current encoding used to interpret string.
to_encoding
The desired encoding of the result.
If the string //TRANSLIT is appended to to_encoding, then transliteration is activated. This means that when a character can't be represented in the target charset, it may be approximated through one or several similar-looking characters. If the string //IGNORE is appended, characters that cannot be represented in the target charset are silently discarded. Otherwise, E_NOTICE is generated and the function will return false.
Caution
If and how //TRANSLIT works exactly depends on the system's iconv() implementation (cf. ICONV_IMPL). Some implementations are known to ignore //TRANSLIT, so the conversion is likely to fail for characters which are illegal for the to_encoding.
string
The string to be converted.
Return Values
Returns the converted string, or false on failure.
Examples
Example #1 iconv() example
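The example listing did not survive in this copy. The following is a reconstruction that is consistent with the output shown below. It assumes the script file is saved as UTF-8, and it is a sketch rather than a guaranteed verbatim copy of the manual's code:

```php
<?php
$text = "This is the Euro symbol '€'.";

echo 'Original : ', $text, PHP_EOL;
echo 'TRANSLIT : ', iconv("UTF-8", "ISO-8859-1//TRANSLIT", $text), PHP_EOL;
echo 'IGNORE   : ', iconv("UTF-8", "ISO-8859-1//IGNORE", $text), PHP_EOL;
echo 'Plain    : ', iconv("UTF-8", "ISO-8859-1", $text), PHP_EOL;
```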
The above example will output something similar to:
Original : This is the Euro symbol '€'.
TRANSLIT : This is the Euro symbol 'EUR'.
IGNORE   : This is the Euro symbol ''.
Plain    :
Notice: iconv(): Detected an illegal character in input string in .\iconv-example.php on line 7
Notes
Note:
The character encodings and options available depend on the installed implementation of iconv. If the argument to from_encoding or to_encoding is not supported on the current system, false will be returned.
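Because an unsupported encoding name and an unconvertible character both surface as a false return value, callers may want to check for false explicitly. A minimal sketch, in which convert_or_null() is a hypothetical helper (not part of PHP) and 'NO-SUCH-ENCODING' is a deliberately invalid name:

```php
<?php
// Hypothetical helper: convert $s, returning null when iconv() reports failure.
function convert_or_null(string $from, string $to, string $s): ?string
{
    // @ suppresses the notice/warning iconv() raises on failure;
    // the false return value is checked explicitly instead.
    $out = @iconv($from, $to, $s);
    return $out === false ? null : $out;
}

var_dump(convert_or_null('UTF-8', 'ISO-8859-1', 'café'));        // 4-byte ISO-8859-1 string
var_dump(convert_or_null('UTF-8', 'NO-SUCH-ENCODING', 'café'));  // NULL
```

Using the strict comparison `=== false` matters here, since an empty string is a legitimate conversion result.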
See Also
- mb_convert_encoding() - Convert a string from one character encoding to another
- UConverter::transcode() - Convert a string from one character encoding to another
orrd101 at gmail dot com
10 years ago
The "//ignore" option doesn't work with recent versions of the iconv library. So if you're having trouble with that option, you aren't alone.
That means you can't currently use this function to filter invalid characters. Instead it silently fails and returns an empty string (or you'll get a notice, but only if you have E_NOTICE enabled).
This has been a known bug with a known solution since at least 2009, but no one seems to be willing to fix it (PHP must pass the -c option to iconv). It's still broken as of the latest release, 5.4.3.
//bugs.php.net/bug.php?id=48147
//bugs.php.net/bug.php?id=52211
//bugs.php.net/bug.php?id=61484
[UPDATE 15-JUN-2012]
Here's a workaround...
ini_set('mbstring.substitute_character', "none");
$text = mb_convert_encoding($text, 'UTF-8', 'UTF-8');
That will strip invalid characters from UTF-8 strings (so that you can insert them into a database, etc.). Instead of "none" you can also use the value 32 if you want it to insert spaces in place of the invalid characters.
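A self-contained sketch of the same workaround, assuming the mbstring extension is loaded. It uses mb_substitute_character() rather than ini_set() so the earlier setting can be restored afterwards; the sample input is made up for illustration:

```php
<?php
// Assumes the mbstring extension is available.
$previous = mb_substitute_character();   // remember the current setting
mb_substitute_character('none');         // drop invalid sequences instead of substituting

$dirty = "abc\xFFdef";                   // the byte 0xFF never occurs in valid UTF-8
$clean = mb_convert_encoding($dirty, 'UTF-8', 'UTF-8');

mb_substitute_character($previous);      // restore the earlier setting
echo $clean, PHP_EOL;
```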
Ritchie
15 years ago
Please note that iconv('UTF-8', 'ASCII//TRANSLIT', ...) doesn't work properly when the locale category LC_CTYPE is set to C or POSIX. You must choose another locale, otherwise all non-ASCII characters will be replaced with question marks. This is true at least with glibc 2.5.
Example:
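The note's own listing is missing here. A minimal sketch of the comparison it describes; the sample text and the UTF-8 locale names are assumptions, and the actual output depends on the iconv implementation and on which locales are installed:

```php
<?php
$text = "Žluťoučký kůň";

// With the C/POSIX locale, glibc-based systems may replace non-ASCII characters with '?'.
setlocale(LC_CTYPE, 'C');
$withC = iconv('UTF-8', 'ASCII//TRANSLIT', $text);
var_dump($withC);

// Try a UTF-8 locale; setlocale() returns false if none of the names is installed.
if (setlocale(LC_CTYPE, ['en_US.UTF-8', 'C.UTF-8']) !== false) {
    var_dump(iconv('UTF-8', 'ASCII//TRANSLIT', $text));
}
```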
daniel dot rhodes at warpasylum dot co dot uk
11 years ago
Interestingly, setting different target locales results in different, yet appropriate, transliterations. For example:
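The listing for this note is missing as well. A sketch of how such a comparison could be run, assuming the locales en_GB.UTF-8 and de_DE.UTF-8 (names chosen for illustration) may or may not be installed; no particular output is claimed:

```php
<?php
$sentence = 'Weiß, Goldmann, Göbel, Weiss, Göthe, Goethe und Götz';
$results = [];

foreach (['en_GB.UTF-8', 'de_DE.UTF-8'] as $locale) {
    if (setlocale(LC_CTYPE, $locale) === false) {
        echo "$locale: not installed", PHP_EOL;
        continue;
    }
    // Transliteration rules are taken from the active LC_CTYPE locale.
    $results[$locale] = iconv('UTF-8', 'ASCII//TRANSLIT', $sentence);
    echo "$locale: ", $results[$locale], PHP_EOL;
}
```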
annuaireehtp at gmail dot com
12 years ago
To test different combinations of conversions between charsets (when the source charset and a suitable destination charset are both unknown), here is an example:
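The original listing is not present in this copy. A minimal sketch of the idea, keeping the $tab, $i and $j names that the note mentions; $my_string and the charset list are assumptions chosen for illustration:

```php
<?php
$my_string = "caf\xE9";   // sample bytes of unknown encoding (hypothetical input)
$tab = ['UTF-8', 'ASCII', 'Windows-1252', 'ISO-8859-15', 'ISO-8859-1'];

foreach ($tab as $i) {
    foreach ($tab as $j) {
        // @ hides the notice for combinations that cannot convert the input.
        $converted = @iconv($i, $j, $my_string);
        echo "$i$j : ", $converted === false ? '(failed)' : $converted, PHP_EOL;
    }
}
```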
After displaying the results, use the $i$j combination that displays correctly.
NB: you can add other charsets to $tab to test other cases.
Daniel Klein
2 years ago
If you want to convert to a Unicode encoding without the byte order mark [BOM], add the endianness to the encoding, e.g. instead of "UTF-16" which will add a BOM to the start of the string, use "UTF-16BE" which will convert the string without adding a BOM.
i.e.
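The listing is missing here. A short sketch of the difference: the output for plain "UTF-16" depends on the iconv implementation (byte order, and whether a BOM is written), so only the explicit-endianness results are annotated:

```php
<?php
$text = 'A';

// Plain "UTF-16": the implementation picks the byte order and typically prepends a BOM.
$generic = iconv('UTF-8', 'UTF-16', $text);
echo bin2hex($generic), PHP_EOL;

// Explicit endianness: no BOM is added.
$be = iconv('UTF-8', 'UTF-16BE', $text);
$le = iconv('UTF-8', 'UTF-16LE', $text);
echo bin2hex($be), PHP_EOL;   // 0041
echo bin2hex($le), PHP_EOL;   // 4100
```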
I have found a lot of hints, suggestions and alternative methods (it's scary and in my opinion no good sign how many ways PHP natively provides to convert the encoding of strings), but none of them really worked, except for this one:
workaround suggested here and elsewhere will also break when encountering illegal characters, at least dropping a useful note ("htmlentities(): Invalid multibyte sequence in argument in...")
zhawari at hotmail dot com
17 years ago
Here is how to convert UCS-2 numbers to UTF-8 numbers in hex:
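The note's original code is not present in this copy. One way to get the same result is sketched below, assuming the hex input is big-endian UCS-2; ucs2_hex_to_utf8_hex() is a hypothetical helper name, not the author's function:

```php
<?php
// Hypothetical helper, not the note's original code.
function ucs2_hex_to_utf8_hex(string $hex): string
{
    $bytes = hex2bin($hex);                       // raw big-endian UCS-2 bytes
    $utf8  = iconv('UCS-2BE', 'UTF-8', $bytes);   // byte order stated explicitly
    return strtoupper(bin2hex($utf8));
}

echo ucs2_hex_to_utf8_hex('06450631062D'), PHP_EOL;   // D985D8B1D8AD
```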
Input:
06450631062D
Output:
D985D8B1D8AD
regards,
Ziyad
Leigh Morresi
14 years ago
If you are getting question marks in your iconv output when transliterating, be sure to call setlocale with a locale your system supports.
Some PHP CMSs default setlocale to 'C', which can be a problem.
Use the "locale" command to list the locales available:
$ locale -a
C
en_AU.utf8
POSIX
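From PHP, the same check can be made by passing several candidate names to setlocale() and inspecting the return value. A minimal sketch; the candidate names are assumptions and should be replaced with entries from your own `locale -a` output:

```php
<?php
// Candidate names are assumptions; use names reported by `locale -a` on your system.
$candidates = ['en_AU.utf8', 'en_US.UTF-8', 'C.UTF-8'];
$chosen = setlocale(LC_CTYPE, $candidates);   // first installed candidate, or false

if ($chosen === false) {
    echo 'None of the candidate locales is installed', PHP_EOL;
} else {
    echo "Using locale $chosen", PHP_EOL;
    echo iconv('UTF-8', 'ASCII//TRANSLIT', 'Crème brûlée'), PHP_EOL;
}
```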