Lunarpages Web Hosting Forum

Author Topic: How to execute query statement with some encoded utf8  (Read 8527 times)

Offline fates66

  • Intergalactic Cowboy
  • *****
  • Posts: 61
How to execute query statement with some encoded utf8
« on: October 15, 2009, 06:32:33 AM »
I have sample code as,
Code: [Select]
#!/usr/bin/perl
print "Content-type: text/html\n\n";
use DBI;
$dbdriver  = "mysql";
$hostname  = "localhost";
$port      = "3306";
$username  = "USERNAME";
$password  = "PASSWORD";
$basename  = "DBName";
$tablename = "TABLE NAME";
$dbh = DBI->connect("dbi:$dbdriver:$basename:$hostname:$port",$username, $password) || die print "Can't connect";
$sqlstatement = "SELECT id, idAct, name, tutor FROM $tablename";
$sth = $dbh->prepare($sqlstatement);
$sth->execute || die "Can't prepare statement: $DBI::errstr";
while (@row=$sth->fetchrow_array) {
print "@row[0], @row[1], @row[2], @row[3]<hr>";
}

I got the results with unreadable characters in name and tutor when visiting this Perl script above from browser.
Anyone help to solve the above. Thanks.

Offline MrPhil

  • Senior Moderator
  • Berserker Poster
  • *****
  • Posts: 6429
Re: How to execute query statement with some encoded utf8
« Reply #1 on: October 15, 2009, 08:59:14 AM »
When you say "unreadable characters", do you mean that something prints out, but it doesn't look like what you're expecting?

If this is your entire web page, the problem is that it's defaulting to Latin-1 encoding. Your data is presumably coming out of the database in UTF-8 encoding, and the characters are not converted to Latin-1 before being displayed. Assuming that there are non-Latin-1 characters to be shown, the easiest solution would be to output a full web page, including a <head> section that includes a
Code: [Select]
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />line. You may need to add a DOCTYPE tag, too. Anyway, the idea is to tell the browser that the text that's coming is UTF-8, not Latin-1.

On the other hand, if you know that all the UTF-8 characters coming out of the database are also found in Latin-1 (U+0000 through U+00FF), you might be able to dig up a translation routine to convert the UTF-8 text to Latin-1 before printing it, eliminating the need to declare the character encoding.
Visit My Site

E-mail Me
-= From the ashes shall rise a sooty tern =-

Offline fates66

  • Intergalactic Cowboy
  • *****
  • Posts: 61
Re: How to execute query statement with some encoded utf8
« Reply #2 on: October 15, 2009, 07:27:15 PM »
Thanks MrPhil. Yes, "unreadable character" as uou said. I follow your instructions and the result is the same.
I try to modeify the second statement  as
    print "Content-type: text/html; charset=utf-8\n\n";
also let file saved as utf-8 format. And the result is same as above.

 I explain more detailed as follows.
1. tables collation encoded as utf8_general_ci
2. some fields collation also encoded as utf8_general_ci

From phpMyAdmin here, I see anything displayed correctly. Also I retrieved any data via PHP and it run well. So what's I can do to solve the above problem. Thanks your comments again.

Offline MrPhil

  • Senior Moderator
  • Berserker Poster
  • *****
  • Posts: 6429
Re: How to execute query statement with some encoded utf8
« Reply #3 on: October 16, 2009, 08:06:15 AM »
Could I see the entire program now? Did the "Content-type..." show up on your screen, or no? If it did, it didn't "take" as a header. You might try splitting it into two Content-Type lines (one for text/html, the other for charset) and see if there's any difference. I don't use Perl much, so I can't confirm whether or not your Content-Type is correct. Maybe also try a capital "T" in "Type" -- sometimes browsers are picky about the case of header entries (Content-Type, not Content-type).
Visit My Site

E-mail Me
-= From the ashes shall rise a sooty tern =-

Offline fates66

  • Intergalactic Cowboy
  • *****
  • Posts: 61
Re: How to execute query statement with some encoded utf8
« Reply #4 on: October 16, 2009, 07:23:15 PM »
Attached code as following,

Code: [Select]
#!/usr/bin/perl
print "Content-Type: text/html\n\n";
print "<html>";
print "<head>";
print "<meta http-equiv=Content-Type content='text/html; charset=utf-8'>";
print "</head>";
print "<body>";

use DBI;
$dbdriver  = "mysql";
$hostname  = "localhost";
$port      = "3306";
$username  = "USERNAME";
$password  = "PASSWORD";
$basename  = "DBName";
$tablename = "TABLE NAME";
$dbh = DBI->connect("dbi:$dbdriver:$basename:$hostname:$port",$username, $password) || die print "Can't connect";
$sqlstatement = "SELECT id, idAct, name, tutor FROM $tablename WHERE idAct=2";
$sth = $dbh->prepare($sqlstatement);
$sth->execute() || die "Can't prepare statement: $DBI::errstr";
$item = $sth->{NAME};
$len = $sth->{NUM_OF_FIELDS};

# print out Field Name
for ($i=0; $i<$len; $i++) {
print $$item[$i], "\t\t";
}
print "<br>";

# print out separator
for ($i=0; $i<$len; $i++) {
print "=======\t\t";
}
print "<br>";
while (@row=$sth->fetchrow_array) {
print "@row[0]\t\t@row[1]\t\t@row[2]\t\t@row[3]<hr>";
}
print "</body>";
print "</html>";

And you could visit there from http://www.TaiwanCareCenter.org/cgi-bin/hello.pl. Thanks.

Offline MrPhil

  • Senior Moderator
  • Berserker Poster
  • *****
  • Posts: 6429
Re: How to execute query statement with some encoded utf8
« Reply #5 on: October 16, 2009, 08:19:02 PM »
Very strange. I can see that your script is writing the text as literal question marks (?), not as UTF-8 characters. Are these Chinese alphabet, rather than Latin-1? You said that phpMyAdmin shows the data correctly for these fields -- when you use the SQL tab to enter the
Code: [Select]
SELECT id, idAct, name, tutor FROM tablename WHERE idAct=2;line, do they display OK?

A few suggestions that might make debugging easier:
* add -w -T flags to the end of the first (#!) line, to warn about errors and problems
* move use DBI; up to the second line (after #! line)
* change the meta line to
Code: [Select]
print "<meta http-equiv='Content-Type' content='text/html; charset=UTF-8' />";* put \n at the end of each line (...\t\t@row[3]<hr>\n";)
No promises that this will fix the problem, but it might make it easier to see what's going on.

I'm not familiar with using Perl with non-Latin alphabets, but I'm wondering if you need to set something in Perl to tell it that you're using UTF-8 characters, possibly DBI->set_encoding(....) or something like that. That it's putting ??? into the HTML output leads me to wonder if these are 24-bit characters (UTF-8 encoding of CJK) that Perl simply doesn't know how to handle in strings. There's an I18N library module, but I don't know if it does anything more than collation control (for string comparisons). Do the number of ??? triplets match the expected number of CJK ideographs in the data? If they do, that would suggest that somewhere between MySQL and Perl handling this text, it's getting confused about the data and converting each character to ???. If that's what's happening, I'm going to have to defer to a Perl expert, as that's beyond my skill level.
Visit My Site

E-mail Me
-= From the ashes shall rise a sooty tern =-

Offline fates66

  • Intergalactic Cowboy
  • *****
  • Posts: 61
Re: How to execute query statement with some encoded utf8
« Reply #6 on: October 17, 2009, 02:16:01 AM »
Thank MrPhil I have fixed the above problem just add
Code: [Select]
{'RaiseError'=>1, 'mysql_enable_utf8'=>1} 
into DBI->connect as
Code: [Select]
$dbh = DBI->connect("dbi:$dbdriver:$basename:$hostname:$port",$username, $password, {'RaiseError'=>1, 'mysql_enable_utf8'=>1}) || die print "Can't connect";And it works!

Offline MrPhil

  • Senior Moderator
  • Berserker Poster
  • *****
  • Posts: 6429
Re: How to execute query statement with some encoded utf8
« Reply #7 on: October 17, 2009, 06:36:26 AM »
And it works!

Yeah! That's what counts! I hope I helped.
Visit My Site

E-mail Me
-= From the ashes shall rise a sooty tern =-