PHP guide: soundex vs metaphone

(PHP 4, PHP 5, PHP 7, PHP 8)


soundex — Calculate the soundex key of a string

Description

soundex(string $string): string

Soundex keys have the property that words pronounced similarly produce the same soundex key, and can thus be used to simplify searches in databases where you know the pronunciation but not the spelling.

This particular soundex function is one described by Donald Knuth in "The Art Of Computer Programming, vol. 3: Sorting And Searching", Addison-Wesley (1973), pp. 391-392.

Parameters

string

The input string.

Return Values

Returns the soundex key as a string with four characters. If at least one letter is contained in string, the returned string starts with a letter. Otherwise "0000" is returned.

Changelog

Version	Description
8.0.0	Prior to this version, calling the function with an empty string returned false for no particular reason.

Examples

Example #1 Soundex Examples

<?php
soundex("Euler")       == soundex("Ellery");    // E460
soundex("Gauss")       == soundex("Ghosh");     // G200
soundex("Hilbert")     == soundex("Heilbronn"); // H416
soundex("Knuth")       == soundex("Kant");      // K530
soundex("Lloyd")       == soundex("Ladd");      // L300
soundex("Lukasiewicz") == soundex("Lissajous"); // L222
?>
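As a rough illustration of the database-search use case described above, the following sketch filters a list of names by soundex key. The helper name `find_sound_alikes` and the sample names are assumptions made for this example; only the built-in soundex() is relied on.

```php
<?php
// Hypothetical helper: return the candidates whose soundex key equals
// the soundex key of $query.
function find_sound_alikes(string $query, array $candidates): array
{
    $key = soundex($query);
    return array_values(array_filter(
        $candidates,
        function (string $name) use ($key): bool {
            return soundex($name) === $key;
        }
    ));
}

$names = ['Robert', 'Rupert', 'Rubin', 'Euler', 'Ellery'];
print_r(find_sound_alikes('Robert', $names)); // Robert and Rupert share the key R163
```

In a real application the key would more likely be stored in an indexed column and compared there, rather than recomputed for every candidate.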

<?php
/**
* A function for retrieving the Kölner Phonetik value of a string
*
* As described at http://de.wikipedia.org/wiki/Kölner_Phonetik
* Based on Hans Joachim Postel: Die Kölner Phonetik.
* Ein Verfahren zur Identifizierung von Personennamen auf der
* Grundlage der Gestaltanalyse.
* in: IBM-Nachrichten, 19. Jahrgang, 1969, S. 925-931
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* @package phonetics
* @version 1.0
* @link http://www.einfachmarke.de
* @license GPL 3.0
* @copyright 2008 by einfachmarke.de
* @author Nicolas Zimmer
*/

/**
* @param string $word string to be analyzed
* @return string $value represents the Kölner Phonetik value
* @access public
*/
function cologne_phon($word)
{
    // prepare for processing
    $word = strtolower($word);
    $substitution = array(
        "ä" => "a",
        "ö" => "o",
        "ü" => "u",
        "ß" => "ss",
        "ph" => "f"
    );
    foreach ($substitution as $letter => $replacement) {
        $word = str_replace($letter, $replacement, $word);
    }
    $len = strlen($word);

    // rules for exceptions
    $exceptionsLeading = array(
        4 => array("ca", "ch", "ck", "cl", "co", "cq", "cu", "cx"),
        8 => array("dc", "ds", "dz", "tc", "ts", "tz")
    );
    $exceptionsFollowing = array("sc", "zc", "cx", "kx", "qx");

    // table for coding
    $codingTable = array(
        0 => array("a", "e", "i", "j", "o", "u", "y"),
        1 => array("b", "p"),
        2 => array("d", "t"),
        3 => array("f", "v", "w"),
        4 => array("c", "g", "k", "q"),
        48 => array("x"),
        5 => array("l"),
        6 => array("m", "n"),
        7 => array("r"),
        8 => array("c", "s", "z"),
    );

    $value = array();
    for ($i = 0; $i < $len; $i++) {
        $value[$i] = "";
        // current letter plus the following one (a single letter at the end of the word)
        $pair = substr($word, $i, 2);

        // exceptions
        if ($i == 0 && $pair == "cr") {
            $value[$i] = 4;
        }
        foreach ($exceptionsLeading as $code => $letters) {
            if (in_array($pair, $letters)) {
                $value[$i] = $code;
            }
        }
        if ($i != 0 && in_array($word[$i - 1] . $word[$i], $exceptionsFollowing)) {
            $value[$i] = 8;
        }

        // normal encoding
        if ($value[$i] === "") {
            foreach ($codingTable as $code => $letters) {
                if (in_array($word[$i], $letters)) {
                    $value[$i] = $code;
                }
            }
        }
    }

    // delete double values
    for ($i = 1; $i < $len; $i++) {
        if ($value[$i] === $value[$i - 1]) {
            $value[$i] = "";
        }
    }

    // delete vowels (code 0), omitting the first character code;
    // h has no code, so its position is already empty
    for ($i = 1; $i < $len; $i++) {
        if ($value[$i] === 0) {
            $value[$i] = "";
        }
    }

    // drop the emptied positions; the strict test keeps a leading 0
    $value = array_filter($value, function ($v) {
        return $v !== "";
    });

    return implode("", $value);
}
?>

Dirk Hoeschen - Feenders de ¶

8 years ago

I made some improvements to the "Cologne Phonetic" function of Nicolas Zimmer. Keys and values of the arrays are inverted to use simple arrays instead of multidimensional arrays. Therefore the loops and iterations are no longer necessary to find the matching value for a char.
I put the function into a static class and moved the array declarations outside the function.

The result is more reliable and five times faster than the original.

<?php
class CologneHash
{
    static $eLeading = array("ca" => 4, "ch" => 4, "ck" => 4, "cl" => 4, "co" => 4, "cq" => 4, "cu" => 4, "cx" => 4, "dc" => 8, "ds" => 8, "dz" => 8, "tc" => 8, "ts" => 8, "tz" => 8);

    static $eFollow = array("sc", "zc", "cx", "kx", "qx");

    static $codingTable = array("a" => 0, "e" => 0, "i" => 0, "j" => 0, "o" => 0, "u" => 0, "y" => 0,
        "b" => 1, "p" => 1, "d" => 2, "t" => 2, "f" => 3, "v" => 3, "w" => 3, "c" => 4, "g" => 4, "k" => 4, "q" => 4,
        "x" => 48, "l" => 5, "m" => 6, "n" => 6, "r" => 7, "c" => 8, "s" => 8, "z" => 8);

    // Unlike cologne_phon(), this method does not lowercase its input.
    public static function getCologneHash($word)
    {
        if (empty($word)) return false;
        $len = strlen($word);
        $value = array();

        for ($i = 0; $i < $len; $i++) {
            $value[$i] = "";
            // current letter plus the following one (a single letter at the end of the word)
            $pair = substr($word, $i, 2);

            // Exceptions
            if ($i == 0 && $pair == "cr") {
                $value[$i] = 4;
            }

            if (isset(self::$eLeading[$pair])) {
                $value[$i] = self::$eLeading[$pair];
            }

            if ($i != 0 && in_array($word[$i - 1] . $word[$i], self::$eFollow)) {
                $value[$i] = 8;
            }

            // normal encoding
            if ($value[$i] === "") {
                if (isset(self::$codingTable[$word[$i]])) {
                    $value[$i] = self::$codingTable[$word[$i]];
                }
            }
        }

        // delete double values
        for ($i = 1; $i < $len; $i++) {
            if ($value[$i] === $value[$i - 1]) {
                $value[$i] = "";
            }
        }

        // delete vowels (code 0), omitting the first character code;
        // h has no code, so its position is already empty
        for ($i = 1; $i < $len; $i++) {
            if ($value[$i] === 0) {
                $value[$i] = "";
            }
        }

        // drop the emptied positions; the strict test keeps a leading 0
        $value = array_filter($value, function ($v) {
            return $v !== "";
        });

        return implode("", $value);
    }
}
?>

synnus at gmail dot com ¶

7 years ago

<?php
// https://github.com/Fruneau/Fruneau.github.io/blob/master/assets/soundex_fr.php
// http://blog.mymind.fr/blog/2007/03/15/soundex-francais/
function soundex_fr($sIn){
static $convVIn, $convVOut, $convGuIn, $convGuOut, $accents;
if (!isset($convGuIn)) {
$accents = array('É' => 'E', 'È' => 'E', 'Ë' => 'E', 'Ê' => 'E',
'Á' => 'A', 'À' => 'A', 'Ä' => 'A', 'Â' => 'A', 'Å' => 'A', 'Ã' => 'A',
'Ï' => 'I', 'Î' => 'I', 'Ì' => 'I', 'Í' => 'I',
'Ô' => 'O', 'Ö' => 'O', 'Ò' => 'O', 'Ó' => 'O', 'Õ' => 'O', 'Ø' => 'O',
'Ú' => 'U', 'Ù' => 'U', 'Û' => 'U', 'Ü' => 'U',
'Ç' => 'C', 'Ñ' => 'N', 'Ç' => 'S', '¿' => 'E',
'é' => 'e', 'è' => 'e', 'ë' => 'e', 'ê' => 'e',
'á' => 'a', 'à' => 'a', 'ä' => 'a', 'â' => 'a', 'å' => 'a', 'ã' => 'a',
'ï' => 'i', 'î' => 'i', 'ì' => 'i', 'í' => 'i',
'ô' => 'o', 'ö' => 'o', 'ò' => 'o', 'ó' => 'o', 'õ' => 'o', 'ø' => 'o',
'ú' => 'u', 'ù' => 'u', 'û' => 'u', 'ü' => 'u',
'ç' => 'c', 'ñ' => 'n');
$convGuIn = array( 'GUI', 'GUE', 'GA', 'GO', 'GU', 'SCI', 'SCE', 'SC', 'CA', 'CO',
'CU', 'QU', 'Q', 'CC', 'CK', 'G', 'ST', 'PH');
$convGuOut = array( 'KI', 'KE', 'KA', 'KO', 'K', 'SI', 'SE', 'SK', 'KA', 'KO',
'KU', 'K', 'K', 'K', 'K', 'J', 'T', 'F');
$convVIn = array( '/E?(AU)/', '/([EA])?[UI]([NM])([^EAIOUY]|$)/', '/[AE]O?[NM]([^AEIOUY]|$)/',
'/[EA][IY]([NM]?[^NM]|$)/', '/(^|[^OEUIA])(OEU|OE|EU)([^OEUIA]|$)/', '/OI/',
'/(ILLE?|I)/', '/O(U|W)/', '/O[NM]($|[^EAOUIY])/', '/(SC|S|C)H/',
'/([^AEIOUY1])[^AEIOUYLKTPNR]([UAO])([^AEIOUY])/', '/([^AEIOUY]|^)([AUO])[^AEIOUYLKTP]([^AEIOUY1])/', '/^KN/',
'/^PF/', '/C([^AEIOUY]|$)/', '/E(Z|R)$/',
'/C/', '/Z$/', '/(?, '/H/', '/W/');
$convVOut = array( 'O', '1\3', 'A\1',
'E\1', '\1E\3', 'O',
'Y', 'U', 'O\1', '9',
'\1\2\3', '\1\2\3', 'N',
'F', 'K\1', 'E',
'S', 'SE', 'S', '', 'V');
}
// an empty input yields a key of four blanks
if ( $sIn === '' ) return '    ';
$sIn = strtr( $sIn, $accents);
$sIn = strtoupper( $sIn );
$sIn = preg_replace( '`[^A-Z]`', '', $sIn );
if ( strlen( $sIn ) === 1 ) return $sIn . '   ';
$sIn = str_replace( $convGuIn, $convGuOut, $sIn );
$sIn = preg_replace( '`(.)\1`', '$1', $sIn );
$sIn = preg_replace( $convVIn, $convVOut, $sIn);
$sIn = preg_replace( '`L?[TDX]?S?$`', '', $sIn );
$sIn = preg_replace( '`(?!^)Y([^AEOU]|$)`', '\1', $sIn);
$sIn = preg_replace( '`(?!^)[EA]`', '', $sIn);
// keep 4 characters, padding with blanks
return substr( $sIn . '    ', 0, 4);
}
?>

cap at capsi dot cx ¶

22 years ago

soundex() unfortunately is very sensitive to the first character. It is not possible to use it and have Clansy and Klansy return the same value. If you want to do a phonetic search on such names, you will still need to write a routine that evaluates C452 as similar to K452.

dcallaghan at linuxmail dot org ¶

20 years ago

select soundex("Dostoyevski")
returns D2312
select substring(soundex("Dostoyevski"), 1, 4);
returns D231

PHP will return the value as 'D231'

So, to use the soundex function to generate a WHERE parameter in a MySQL SELECT statement, you might try this:
$s = soundex('Dostoyevski');
SELECT * FROM authors WHERE substring(soundex(lastname), 1 , 4) = "' . $s . '"';

Or, if you want to bypass the PHP function
$result = mysql_query("select soundex('Dostoyevski')");
$s = mysql_result($result, 0, 0);
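The mysql_* functions used in this note were removed in PHP 7, and concatenating the key into the SQL string is easy to get wrong. A minimal sketch of the same idea with a placeholder follows; `soundex_where` is a hypothetical helper, and binding the value through PDO or mysqli prepared statements is assumed to happen elsewhere.

```php
<?php
// Hypothetical helper: builds the WHERE fragment and the value to bind.
// $column must be a trusted identifier, because it is interpolated as-is.
function soundex_where(string $column, string $term): array
{
    $sql = "SUBSTRING(SOUNDEX($column), 1, 4) = ?";
    return [$sql, soundex($term)];
}

[$where, $key] = soundex_where('lastname', 'Dostoyevski');
echo "SELECT * FROM authors WHERE $where", PHP_EOL; // bind $key to the placeholder
```

The SUBSTRING call is kept because, as the note shows, MySQL may return a key longer than the four characters PHP produces.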

witold4249 at rogers dot com ¶

20 years ago

A much easier way to check the similarity between words and avoid the problems that occur with Klancy/Clancy would be to simply add any letter to the front of the string

i.e.: OKlancy/OClancy
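The prefix trick can be sketched as follows. The function name `prefixed_soundex` and the default prefix 'O' are choices made for this example, not part of the note.

```php
<?php
// Prefix the word with a fixed letter before encoding, so that the word's
// own first letter is encoded as a digit instead of being kept verbatim.
function prefixed_soundex(string $word, string $prefix = 'O'): string
{
    return soundex($prefix . $word);
}

var_dump(soundex('Klancy') === soundex('Clancy'));                   // bool(false): K452 vs C452
var_dump(prefixed_soundex('Klancy') === prefixed_soundex('Clancy')); // bool(true): both O245
```

One trade-off: the prefix occupies the letter slot, so the key still holds only three digits and the last encoded consonant of the unprefixed key is dropped (K452 becomes O245).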

Justin at no dot blukrew dot spam dot com ¶

17 years ago

I was originally looking at soundex() because I wanted to compare how individual letters sound, so that when a string of generated characters is pronounced, the characters are easy to tell apart (i.e., TGDE is hard to distinguish, while RFQA is easier to understand). The goal was to generate IDs that could be understood with a high degree of accuracy over a radio of varying quality. I quickly realized that soundex and metaphone would not do this (they work on words), so I wrote the following to help. The ID generation function repeatedly calls chrSoundAlike() to compare each new character with the preceding characters. I am interested in receiving any feedback on this. Thanks.

<?php
function chrSoundAlike($char1, $char2, $opts = FALSE) {
    $char1 = strtoupper($char1);
    $char2 = strtoupper($char2);
    $opts = strtoupper($opts);

    // Setup the sets of characters that sound alike.
    // (Options: include numbers, include W, include both, or default is none of those.)
    switch ($opts) {
        case 'NUMBERS':
            $sets = array(0 => array('A', 'J', 'K'),
                1 => array('B', 'C', 'D', 'E', 'G', 'P', 'T', 'V', 'Z', '3'),
                2 => array('F', 'S', 'X'),
                3 => array('I', 'Y'),
                4 => array('M', 'N'),
                5 => array('Q', 'U', 'W'));
            break;

        case 'STRICT':
            $sets = array(0 => array('A', 'J', 'K'),
                1 => array('B', 'C', 'D', 'E', 'G', 'P', 'T', 'V', 'Z'),
                2 => array('F', 'S', 'X'),
                3 => array('I', 'Y'),
                4 => array('M', 'N'),
                5 => array('Q', 'U', 'W'));
            break;

        case 'BOTH':
            $sets = array(0 => array('A', 'J', 'K'),
                1 => array('B', 'C', 'D', 'E', 'G', 'P', 'T', 'V', 'Z', '3'),
                2 => array('F', 'S', 'X'),
                3 => array('I', 'Y'),
                4 => array('M', 'N'),
                5 => array('Q', 'U', 'W'));
            break;

        default:
            $sets = array(0 => array('A', 'J', 'K'),
                1 => array('B', 'C', 'D', 'E', 'G', 'P', 'T', 'V', 'Z'),
                2 => array('F', 'S', 'X'),
                3 => array('I', 'Y'),
                4 => array('M', 'N'),
                5 => array('Q', 'U'));
            break;
    }

    // See if $char1 is in a set.
    $matchset = array();
    for ($i = 0; $i < count($sets); $i++) {
        if (in_array($char1, $sets[$i])) {
            $matchset = $sets[$i];
        }
    }

    // If char2 is in the same set as char1, or if char1 and char2 are the same, then return true.
    if (in_array($char2, $matchset) OR $char1 == $char2) {
        return TRUE;
    } else {
        return FALSE;
    }
}
?>

administrator at zinious dot com ¶

20 years ago

I wrote this function a long time ago in CGI-perl and then translated (if you can call it that) into PHP. A little clunky to say the least, but should handle true soundex specs 100%:

// ---begin code---

function MakeSoundEx($stringtomakesoundexof)
{
$temp_Name = $stringtomakesoundexof;
$SoundKey1 = "BPFV";
$SoundKey2 = "CSKGJQXZ";
$SoundKey3 = "DT";
$SoundKey4 = "L";
$SoundKey5 = "MN";
$SoundKey6 = "R";
$SoundKey7 = "AEHIOUWY";

$temp_Name = strtoupper($temp_Name);
$temp_Last = "";
$temp_Soundex = substr($temp_Name, 0, 1);

$n = 1;
for ($i = 0; $i < strlen($SoundKey1); $i++)
{
if ($temp_Soundex == substr($SoundKey1, $i - 1, 1))
{
$temp_Last = "1";
}
}
for ($i = 0; $i < strlen($SoundKey2); $i++)
{
if ($temp_Soundex == substr($SoundKey2, $i - 1, 1))
{
$temp_Last = "2";
}
}
for ($i = 0; $i < strlen($SoundKey3); $i++)
{
if ($temp_Soundex == substr($SoundKey3, $i - 1, 1))
{
$temp_Last = "3";
}
}
for ($i = 0; $i < strlen($SoundKey4); $i++)
{
if ($temp_Soundex == substr($SoundKey4, $i - 1, 1))
{
$temp_Last = "4";
}
}
for ($i = 0; $i < strlen($SoundKey5); $i++)
{
if ($temp_Soundex == substr($SoundKey5, $i - 1, 1))
{
$temp_Last = "5";
}
}
for ($i = 0; $i < strlen($SoundKey6); $i++)
{
if ($temp_Soundex == substr($SoundKey6, $i - 1, 1))
{
$temp_Last = "6";
}
}
for ($i = 0; $i < strlen($SoundKey6); $i++)
{
if ($temp_Soundex == substr($SoundKey6, $i - 1, 1))
{
$temp_Last = "";
}
}

for ($n = 1; $n < strlen($temp_Name); $n++)
{
if (strlen($temp_Soundex) < 4)
{
for ($i = 0; $i < strlen($SoundKey1); $i++)
{
if (substr($temp_Name, $n - 1, 1) == substr($SoundKey1, $i - 1, 1) && $temp_Last != "1")
{
$temp_Soundex = $temp_Soundex."1";
$temp_Last = "1";
}
}
for ($i = 0; $i < strlen($SoundKey2); $i++)
{
if (substr($temp_Name, $n - 1, 1) == substr($SoundKey2, $i - 1, 1) && $temp_Last != "2")
{
$temp_Soundex = $temp_Soundex."2";
$temp_Last = "2";
}
}
for ($i = 0; $i < strlen($SoundKey3); $i++)
{
if (substr($temp_Name, $n - 1, 1) == substr($SoundKey3, $i - 1, 1) && $temp_Last != "3")
{
$temp_Soundex = $temp_Soundex."3";
$temp_Last = "3";
}
}
for ($i = 0; $i < strlen($SoundKey4); $i++)
{
if (substr($temp_Name, $n - 1, 1) == substr($SoundKey4, $i - 1, 1) && $temp_Last != "4")
{
$temp_Soundex = $temp_Soundex."4";
$temp_Last = "4";
}
}
for ($i = 0; $i < strlen($SoundKey5); $i++)
{
if (substr($temp_Name, $n - 1, 1) == substr($SoundKey5, $i - 1, 1) && $temp_Last != "5")
{
$temp_Soundex = $temp_Soundex."5";
$temp_Last = "5";
}
}
for ($i = 0; $i < strlen($SoundKey6); $i++)
{
if (substr($temp_Name, $n - 1, 1) == substr($SoundKey6, $i - 1, 1) && $temp_Last != "6")
{
$temp_Soundex = $temp_Soundex."6";
$temp_Last = "6";
}
}
for ($i = 0; $i < strlen($SoundKey7); $i++)
{
if (substr($temp_Name, $n - 1, 1) == substr($SoundKey7, $i - 1, 1))
{
$temp_Last = "";
}
}
}
}

while (strlen($temp_Soundex) < 4)
{
$temp_Soundex = $temp_Soundex."0";
}

return $temp_Soundex;
}

// ---end code---

crchafer-php at c2se dot com ¶

16 years ago

Rewritten, maybe -- but the algorithm has some obvious
optimisations which can be done, for example...

function text__soundex( $text ) {
$k = ' 123 12  22455 12623 1 2 2';
$nl = strlen( $tN = strtoupper( $text ) );
$p = trim( $k[ ord( $tS = $tN[0] ) - 65 ] );
for( $n = 1; $n < $nl; ++$n )
if( ( $l = trim( $k[ ord( $tN[ $n ] ) - 65 ] ) ) != $p )
$tS .= ( $p = $l );
return substr( $tS . '000', 0, 4 );
}

// Notes:
// $k is the $key, essentially $SoundKey inverted
// $tN is the uppercase of the text to be optimised
// $tS is the partially generated output
// $l is the current letter, $p the previous
// $n and $nl are iteration indices
// 65 is ord('A'), precalculated for speed
// non-ASCII letters are not supported
// watch the brackets, quite a mixture here

(Should
match the output of PHP's soundex(), speed untested --
though this should be /much/ faster than a4_perfect's
rewrite due to the removal of most loops and compares.)

C
2005-09-13

fie at myrealbox dot com ¶

19 years ago

administrator at zinious dot com:

I'm sorry, but your code does not comply with soundex.
Here were my results with your code, my code, and the default..

string: rest
R620 perform administrator's function 0.009452
R230 perform cg's function 0.001779
R230 perform default soundex function 9.4999999999956E-005

string: reset
R620 perform administrator's function 0.0055900000000001
R230 perform cg's function 0.00091799999999997
R230 perform default soundex function 0.00010600000000005

I don't know why the default, every now and then, will for some reason be 9.xxx. Very odd, I think.
My code is at the bottom.. these tests were before the soundex modification that I describe below..
BTW, for all the original specs on the soundex algorithm go to
http://www.star-shine.net/~functionifelse/GFD/?word=soundex

dalibor dot toth at podravka dot hr:

Well, maybe it is sad that it gives you the same code,
even metaphone has that problem..
but one might not want to be so accurate.. if someone
is on a search engine.. let's call it shmoogle.. looking
for "php array reset" and searches for "php array rest"
then shmoogle might return stuff about beds and such..
(if they were all stupid and didn't use the first words
as more important) so anyway shmoogle might need it to
be less accurate in such cases.. but nonetheless..
my fix for this is to add the number of syllables at the end of the string, making it 5 characters long..
this would work as follows..

code at: http://star-shine.net/~functionifelse/cg_soundex.php

or if you would like to just use the default soundex function

$str = soundex($str).cg_sylc($str);

revolutionary, more or less.. less...
This function is only meant for one word though.. I'd like to see someone
modify it to use split and run it through a loop to get each word's cg_soundex,
that'll be fun ;)
I would also like to suggest to the php zend apache kinda people who make php
to add an optional additional variable the user can specify as follows

soundex("string",SYL);

this would return the number of syllables at the end of the string.
Highly accurate sound testing, woo! Also you could add VOW for vowels
and CONS for consonants or whatever else someone would want..
but I really think the number of syllables will be plenty efficient.
umm.. if this helps anyone, you're welcome.. ummm.. good luck in all
your php adventures.. oh... and the final results

syllables
1 rest
2 reset
metaphone
RST rest
RST reset
soundex
R230 rest
R230 reset

string: rest
R2301 perform cg's function 0.00211
R230 perform default soundex function 0.00011299999999997

string: reset
R2302 perform cg's function 0.001691
R230 perform default soundex function 0.00010399999999999

The default function is a bit faster..
so maybe they will add this option and we'll have speed and accuracy.

Silent wind of doom, woosh!
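The idea of appending a syllable count to the soundex key can be sketched like this. The counter below is a deliberately simple stand-in (it counts runs of vowels) and is an assumption of this example, not the cg_sylc() function the note refers to; both function names are hypothetical.

```php
<?php
// Stand-in syllable estimate (not fie's cg_sylc): counts runs of
// consecutive vowels, treating only a, e, i, o, u as vowels.
function count_vowel_groups(string $word): int
{
    return (int) preg_match_all('/[aeiou]+/i', $word);
}

// Soundex key followed by the syllable estimate.
function soundex_with_syllables(string $word): string
{
    return soundex($word) . count_vowel_groups($word);
}

echo soundex_with_syllables('rest'), PHP_EOL;  // R2301
echo soundex_with_syllables('reset'), PHP_EOL; // R2302
```

For these two words the sketch happens to give the same keys as the results listed in the note, but a different counter could give different counts for other words.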

synnus at gmail dot com ¶

2 years ago

<?php
/* SOUNDEX FRENCH
Frederic Bouchery    26-Sep-2003
http://www.php-help.net/sources-php/a.french.adapted.soundex.289.html
*/
function soundex2( $sIn ) {
   // If there is no word, return immediately
   if ( $sIn === '' ) return '    ';
   // Convert everything to uppercase
   $sIn = strtoupper( $sIn );
   // Remove the accents
   $sIn = strtr( $sIn, 'ÂÄÀÇÈÉÊËŒÎÏÔÖÙÛÜ', 'AAASEEEEEIIOOUUU' );
   // Remove everything that is not a letter
   $sIn = preg_replace( '`[^A-Z]`', '', $sIn );
   // If the string is a single character, return it padded.
   if ( strlen( $sIn ) === 1 ) return $sIn . '   ';
   // Replace the primary consonant sounds
   $convIn = array( 'GUI', 'GUE', 'GA', 'GO', 'GU', 'CA', 'CO', 'CU',
'Q', 'CC', 'CK' );
   $convOut = array( 'KI', 'KE', 'KA', 'KO', 'K', 'KA', 'KO', 'KU', 'K',
'K', 'K' );
   $sIn = str_replace( $convIn, $convOut, $sIn );
   // Replace the vowels, except Y and except the first one, with A
   $sIn = preg_replace( '`(?, 'A', $sIn );
   // Replace the prefixes, then keep the first letter
   // and apply the complementary replacements
   $convIn = array( '`^KN`', '`^(PH|PF)`', '`^MAC`', '`^SCH`', '`^ASA`',
'`(?, '`(?, '`(?, '`(?,
'`(?);
   $convOut = array( 'NN', 'FF', 'MCC', 'SSS', 'AZA', 'NN', 'FF', 'MCC',
'SSS', 'AZA' );
   $sIn = preg_replace( $convIn, $convOut, $sIn );
   // Remove H except in CH or SH
   $sIn = preg_replace( '`(?, '', $sIn );
   // Remove Y except when preceded by an A
   $sIn = preg_replace( '`(?, '', $sIn );
   // Remove the endings A, T, D, S
   $sIn = preg_replace( '`[ATDS]$`', '', $sIn );
   // Remove every A except a leading one
   $sIn = preg_replace( '`(?!^)A`', '', $sIn );
   // Remove repeated letters
   $sIn = preg_replace( '`(.)\1`', '$1', $sIn );
   // Keep only 4 characters, or pad with blanks
   return substr( $sIn . '    ', 0, 4);
}
?>

Anonymous ¶

16 years ago

Since the first letter is included in the phonetic representation in the output, it is worth pointing out that if you want a working soundex key without the problem of Klansy and Clansy sounding different, take the substring after the first letter: the first letter serves as the major constant of the word and the numeric value as the phonetic structure of the word.

mail at gettheawayspam dot iaindooley dot com ¶

19 years ago

It is possible to work around the 'different first letter' problem of soundex by using levenshtein() on the soundex codes. In my application, which searches a database of album names for entries matching a particular user string, I do the following:

1. Search the database for the exact name
2. Search the database for entries where the name occurs anyway as a string
3. Search the database for entries where any of the words in the name (if the user has typed in more than one word) is present, except for little words (and, the, of etc)
4. Then, if all this fails, I go to plan b:

- Calculate the levenshtein distance (levenshtein()) between the user-entered search term and each entry in the database as a percentage of the length of the user-entered search term

- Calculate the levenshtein distance between the metaphone codes of the user-entered search term and each field in the database as a percentage of the length of the metaphone code of the user-entered search term

- Calculate the levenshtein distance between the soundex codes of the user-entered search term and each field in the database as a percentage of the length of the soundex code of the user-entered search term

If any of these percentages is less than 50 (meaning that two soundex codes with different first letters would be accepted!!) then the entry is accepted as a possible match.

fie at myrealbox dot com ¶

19 years ago

eek... the hosting was taken down on that server.. here is the code from earlier

function cg_sylc($nos){
 $nos = strtoupper($nos);
 $syllables = 0;

 // each removed vowel pair counts as one syllable
 $before = strlen($nos);
 $nos = str_replace(array('AA','AE','AI','AO','AU',
 'EA','EE','EI','EO','EU','IA','IE','II','IO',
 'IU','OA','OE','OI','OO','OU','UA','UE',
 'UI','UO','UU'), "", $nos);
 $after = strlen($nos);
 $diference = $before - $after;
 if($before != $after) $syllables += $diference / 2;

 if($nos[strlen($nos)-1] == "E") $syllables --;
 if($nos[strlen($nos)-1] == "Y") $syllables ++;

 // each remaining single vowel counts as one syllable
 $before = $after;
 $nos = str_replace(array('A','E','I','O','U'), "", $nos);
 $after = strlen($nos);
 $syllables += ($before - $after);

 return $syllables;
}

// ...
$syl = cg_sylc($SExStr);
$SExStr = strtoupper($SExStr);
// ...
$tsstr .= $SExStr[$ii];
$i ++;
}
if($SExStr[$ii] == false){
break;
}
}
// ...
$tsstr = str_replace(array('B', 'F', 'P', 'V'), "1", $tsstr);
$tsstr = str_replace(array('C', 'G', 'J', 'K', 'Q', 'S', 'X', 'Z', '?'), "2", $tsstr);
$tsstr = str_replace(array('D', 'T'), "3", $tsstr);
$tsstr = str_replace(array('L'), "4", $tsstr);
$tsstr = str_replace(array('M', 'N', '?'), "5", $tsstr);
$tsstr = str_replace(array('R'), "6", $tsstr);
// ...
if($tsstr[$iii] != false){
$ttsstr .= $tsstr[$iii];
} else {
$ttsstr .= "0";
}
$iii ++;
}
$ttsstr .= $syl;
print $ttsstr;
}
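The "plan b" matching from the note by mail at gettheawayspam dot iaindooley dot com (levenshtein distance between phonetic codes, as a percentage of the code length) might look roughly like this. The function names and the way the two checks are combined are assumptions of this sketch; the 50 percent threshold comes from the note.

```php
<?php
// Levenshtein distance between the phonetic codes of two strings, as a
// percentage of the length of the search term's code.
// $encoder is any callable returning a code, e.g. 'soundex' or 'metaphone'.
function code_distance_percent(callable $encoder, string $search, string $entry): float
{
    $searchCode = $encoder($search);
    $entryCode  = $encoder($entry);
    if ($searchCode === '') {
        return 100.0; // nothing to compare against
    }
    return 100.0 * levenshtein($searchCode, $entryCode) / strlen($searchCode);
}

// Accept an entry when either percentage falls below the threshold.
function is_possible_match(string $search, string $entry, float $threshold = 50.0): bool
{
    return code_distance_percent('soundex', $search, $entry) < $threshold
        || code_distance_percent('metaphone', $search, $entry) < $threshold;
}

var_dump(code_distance_percent('soundex', 'Klancy', 'Clancy')); // float(25)
var_dump(is_possible_match('Klancy', 'Clancy'));                // bool(true)
```

K452 and C452 differ in one of four characters, hence 25 percent. The comparison on the raw strings from the first bullet could be added the same way by passing an encoder that returns its input unchanged.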

19 years ago

Clancy => LClancy

19 years ago

$sql = "SELECT * FROM table WHERE substring(soundex(field), 1, 4) = substring(soundex('".$wordsearch."'), 1, 4)";

pee whitt at dental dot ufl dor edu ¶

19 years ago

fie at myrealbox dot com-

Regarding your soundex syllable request- I think that counting the vowel clusters in the word would result in an accurate syllable count. So no soundex feature is needed; just step through the characters in the word and, each time you go from a vowel to a consonant, increment the syllable count.

Using this logic, this sentence is classified as follows.
2 1 2 1 1 (3) (0) (4) (0) 2

where (#) marks an incorrectly classified word. I am sure that with a little thought one could work out logic for those cases that would result in a correct count. Counting the changes from vowel to consonant would yield-
(1) 1 2 1 2 1 (4) 1 2

Taking the average and then the ceiling of the two would fix most of the errors.
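A minimal sketch of the two counts and their combination, under the assumption that only a, e, i, o and u are vowels and every other letter (including y) is a consonant. The function names are hypothetical, and no claim is made about how accurate the estimate is for real words.

```php
<?php
// Count runs of consecutive vowels.
function vowel_clusters(string $word): int
{
    return (int) preg_match_all('/[aeiou]+/i', $word);
}

// Count positions where a vowel is directly followed by a consonant.
function vowel_to_consonant_changes(string $word): int
{
    return (int) preg_match_all('/[aeiou][b-df-hj-np-tv-z]/i', $word);
}

// Ceiling of the average of the two counts.
function estimate_syllables(string $word): int
{
    $sum = vowel_clusters($word) + vowel_to_consonant_changes($word);
    return (int) ceil($sum / 2);
}

echo estimate_syllables('reset'), PHP_EOL; // 2 (two clusters, two changes)
echo estimate_syllables('the'), PHP_EOL;   // 1 (one cluster, no change)
```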

shortcut ¶

15 years ago

The answer to whether soundex can work apart from the first letter in Klancy vs Clancy is to always prefix the words with the same letter.

aklancy will match aclancy
bklancy will match bclancy

soundex seems to only check the 1st syllables.??
ie: spectacular matches spectacle

Just a thought if you rely on soundex.
