Perl: UTF8 / Unicode Characters Do Not Arrive Correctly in Salesforce
Hi Guys,
I use WWW::Salesforce to create accounts, contacts and opportunities in Salesforce. It all works nicely until I try to send customer names that contain non-ASCII characters.
When viewed in Salesforce, these look as if the encoding is broken. AFAIK SOAP::Lite, which WWW::Salesforce uses, is UTF-8 by default (I tried setting it manually, no success).
Interestingly, when I use Encode::encode_utf8 on the strings I pass into the hash (see below), Dumper prints the strings properly (and no, it does not work without that call either). So I'm thinking I got the strings properly utf8-ed. Still, no success on the Salesforce side. Could it be that SOAP::Lite re-encodes them even though they are already UTF-8? (If I leave them unencoded, the string that arrives is a different one but still broken.) Here are the details:
Sent string (contact name): 2äöäöääö
Arrived string: 2äöäöääö
Dumper output:
$VAR1 = {
'Phone' => undef,
'MailingCountry' => undef,
'FirstName' => '1lölölööl',
'Login__c' => 'eetestu14',
'LastName' => '2äöäöääö',
'MailingStreet' => undef,
'Email' => 'XXXXXXXXX',
'type' => 'contact',
'AccountId' => 'XXXXXXXX',
};
Perl code:
use Encode;
my $contact =
{
'type'=> 'contact',
'AccountId' => $accountId,
'FirstName' => encode_utf8($user->getFirstName()),
'LastName' => encode_utf8($user->getLastName()),
'Login__c' => $user->getLogin(),
'Email' => $user->getEmail(),
'Phone' => $user->getContactPhone(),
'MailingStreet' => $user->getContactAddress1(),
'MailingCountry' => $user->getContactCountry(),
};
$res = $self->sf()->create(%$contact);
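Dumper output depends on how the terminal interprets what it prints, so it can look right even when the string is not what the API expects. A minimal sketch for checking what a string really contains, using only core Perl; `describe` is a hypothetical helper name, not part of WWW::Salesforce or SOAP::Lite:

```perl
use strict;
use warnings;

# Hypothetical helper: report the length of a string and the code of each
# unit in it, so UTF-8 octets can be told apart from decoded characters.
sub describe {
    my ($s) = @_;
    my $hex = join ' ', map { sprintf '%02X', ord } split //, $s;
    return sprintf 'length %d: %s', length $s, $hex;
}

print describe("\xC3\xA4"), "\n";   # length 2: C3 A4  (UTF-8 octets for "ä")
print describe("\x{E4}"), "\n";     # length 1: E4     (the single character "ä")
```

If a name from the back end shows two units per umlaut, it is still a string of UTF-8 octets, which would point towards decoding rather than encoding it.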
So, any ideas what I might be doing wrong?
Cheers,
Emil
OK, I've solved it. The solution, although a weird one is the following:
use Encode;
my $string = decode_utf8($user->getFirstName());
Yes, it is DEcode, not encode. If I don't use decode (i.e. I pass the string through as is) it does NOT work. I'm too busy to research why the above works (our MySQL charset IS utf8).
The above $string when sent to salesforce arrives properly in salesforce (I tested with chinese/japanese/sanskrit and German).
Additionally, I found that I had to manually replace the ampersand (&) with (&amp;), otherwise SOAP::Lite would barf (I would have expected this to happen somewhere in that API...)
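A minimal sketch of the manual escaping described above, assuming (as in this post) that the installed SOAP::Lite / WWW::Salesforce combination does not escape field values itself. `xml_escape` is a hypothetical helper; if your versions already escape, this would double-escape, so check the outgoing XML first:

```perl
use strict;
use warnings;

# Hypothetical helper: escape the characters that are special in XML text.
sub xml_escape {
    my ($text) = @_;
    return $text unless defined $text;   # pass undef fields through untouched
    $text =~ s/&/&amp;/g;                # ampersand first, so later entities stay intact
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

print xml_escape('Smith & Sons <Ltd>'), "\n";   # Smith &amp; Sons &lt;Ltd&gt;
```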
All Answers
hmm sounds like a double encoding.
The question is then how the data is stored in your $user object. encode_utf8 treats its input as a string of characters, so if you hand it bytes that are already UTF-8, each byte is read as an ISO-8859-1 character and encoded a second time. Be aware that ISO-8859-1 is not the same thing as "extended ASCII". You might try a test case where you hard-code the input data. For that, edit the file with an editor that can save files in various charsets (e.g. Eclipse).
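The double encoding can be reproduced with the core Encode module alone. This is a sketch under the assumption that the $user object returns raw UTF-8 octets, which the thread does not confirm; the sample value is made up:

```perl
use strict;
use warnings;
use Encode qw(encode_utf8 decode_utf8);

# Assumption: the back end returns UTF-8 octets, here "äö" as four bytes.
my $octets = "\xC3\xA4\xC3\xB6";

# encode_utf8 reads each octet as one character and encodes it again.
my $double = encode_utf8($octets);
printf "after encode_utf8: %d octets\n", length $double;     # 8

# decode_utf8 turns the same octets into a two-character string.
my $chars = decode_utf8($octets);
printf "after decode_utf8: %d characters\n", length $chars;  # 2
```

A serializer that encodes on output would then send the two-character string as the original four octets, which is consistent with decode_utf8 being the call that works.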
best wishes
Carsten
Hi Carsten,
Thanks for the quick answer :).
I did try hardcoding the umlauts (with and without encode_utf8() surrounding them). It didn't work.
As to the charset of the $user strings, these come from a mysql db via our back-end. Our UI grabs these and displays them correctly (we do set the http content type to UTF-8 for our pages explicitly).
Hmm, I did use vim though (for the hard coding part). I might try another editor and save the file explicitly in utf8, that I'll try.
Any other ideas? Do you think I should mess with SOAP::Lite itself or the WWW::Salesforce module?
Cheers,
Emil
Emil,
Try the hard-coding first; vim is not the best choice here anyway :)
The Perl modules should be fine. They should work with UTF-8 ...
You might also poke around a bit on the database side. The MySQL driver should allow you to switch the result data to UTF-8 as well.
Check for "set names 'UTF8'", "set character_set_client =", "set character_set_results =" and "set character_set_connection =" in the MySQL docs. I assume that you run the MySQL tables in latin1 (which is the default), so firing the above set commands might help.
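If the data is read through Perl's DBI with DBD::mysql (an assumption; the thread does not say which driver the back end uses), the driver's `mysql_enable_utf8` connect attribute is one way to get a similar effect to the set commands above: per the DBD::mysql documentation it tells the server to treat the connection as UTF-8 and returns decoded character strings. `utf8_connect_args`, the database name and the credentials below are hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: build the argument list for DBI->connect with
# UTF-8 handling switched on in DBD::mysql.
sub utf8_connect_args {
    my ($db, $host, $user, $pass) = @_;
    return (
        "DBI:mysql:database=$db;host=$host",
        $user,
        $pass,
        { RaiseError => 1, mysql_enable_utf8 => 1 },
    );
}

# Usage, assuming DBI and DBD::mysql are installed and the server is reachable:
#   use DBI;
#   my $dbh = DBI->connect(utf8_connect_args('crm', 'localhost', 'app', 'secret'));
```

With strings arriving already decoded, an extra decode_utf8 in the application code would no longer be needed, so only one of the two fixes should be in place at a time.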
Carsten Harnisch
-- InTradeSys Limited
I am getting the same issue with my code. Only every other update to the campaign goes through. As you can see below, I pass the campaign name Cell test2: âÎîÌÃé, but the response still contains the old campaign name Cell test1: âÎîÌÃé.
I tried decode_utf8, but that failed as well and I am getting junk characters.
Please help as soon as possible.
<soap:Body> <AddAdsRequest xmlns="https://adcenter.microsoft.com/v6"> <AdGroupId>51677679</AdGroupId>
<Ads> <Ad xsi:type="TextAd"> <DestinationUrl>{param1}</DestinationUrl>
<DisplayUrl>latin1test5.rtrk.com</DisplayUrl>
<Text>Latin1 Test5 Canoga Park, CA.</Text>
<Title>Cell test2: âÎîÌÃé </Title> </Ad> </Ads> </AddAdsRequest> </soap:Body>
<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/"><s:Header><h:TrackingId xmlns:h="https://adcenter.microsoft.com/v6">8b4417fa-b66f-48e3-802b-18c1076f11a0</h:TrackingId></s:Header><s:Body><GetAdsByAdGroupIdResponse xmlns="https://adcenter.microsoft.com/v6"><Ads xmlns:i="http://www.w3.org/2001/XMLSchema-instance"><Ad i:type="TextAd"><EditorialStatus>Active</EditorialStatus><Id>762859</Id><Status>Active</Status><Type>Text</Type><DestinationUrl>{param1}</DestinationUrl><DisplayUrl>latin1test5.rtrk.com</DisplayUrl><Text>Latin1 Test5 Canoga Park, CA.</Text><Title>Cell test1: âÎîÌÃé</Title></Ad><Ad i:type="TextAd"><EditorialStatus>Active</EditorialStatus><Id>762860</Id><Status>Active</Status><Type>Text</Type><DestinationUrl>{param1}</DestinationUrl><DisplayUrl>latin1test5.rtrk.com</DisplayUrl><Text>Latin1 Test5 Canoga Park, CA.</Text><Title>{keyword}</Title></Ad></Ads></GetAdsByAdGroupIdResponse></s:Body></s:Envelope>
Not sure if this will work for you but I was having an issue with UTF8 encoded unicode characters that not only did not arrive in Salesforce but were causing a parsing error in the SFDC API. English characters worked fine but when we had European language-specific characters, it was causing problems.
We're using an older version of SOAP::Lite so YMMV, but I wanted to share in case this is helpful. We decode_utf8 and then HTML numeric (hex) encode the result. In particular:
use Encode;
use HTML::Entities;
my $string = 'Tesß ä';
$string = decode_utf8($string);
$string = HTML::Entities::encode_entities_numeric($string);
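To see what such a string looks like on the wire without installing HTML::Entities, here is a core-only sketch of the same idea. `to_numeric_entities` is a hypothetical helper, and unlike encode_entities_numeric it only rewrites non-ASCII characters and leaves &, < and > alone:

```perl
use strict;
use warnings;
use Encode qw(decode_utf8);

# Hypothetical helper: decode UTF-8 octets, then replace every non-ASCII
# character with a hexadecimal numeric character reference.
sub to_numeric_entities {
    my ($octets) = @_;
    my $chars = decode_utf8($octets);
    $chars =~ s/([^\x00-\x7F])/sprintf '&#x%X;', ord $1/ge;
    return $chars;
}

# "Tesß ä" written as UTF-8 octets so the example does not depend on the
# encoding of the source file.
print to_numeric_entities("Tes\xC3\x9F \xC3\xA4"), "\n";   # Tes&#xDF; &#xE4;
```

The result is pure ASCII, so it survives any further encoding step unchanged.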