cvsdude

Perl: UTF8 / Unicode Characters Do Not Arrive Correctly in Salesforce

Hi Guys,

I use WWW::Salesforce to create accounts, contacts and opportunities in Salesforce. It all works nicely until I try to send customer names that contain non-ASCII characters.

 

When viewed in Salesforce, these look as if the encoding is broken. AFAIK SOAP::Lite, which WWW::Salesforce uses, is UTF-8 by default (I tried setting it manually, no success).

 

Interestingly, when I use Encode::encode_utf8 on the strings I put into the hash (see below), Dumper prints the strings properly (and no, it does not work without that call either). So I'm thinking I got the strings properly UTF-8 encoded. Still, no success on the Salesforce side. Could it be that SOAP::Lite re-encodes them although they are already UTF-8? (If I leave them unencoded, the string that arrives is a different one but still broken.) Here are the details:

 

Sent string (contact name): 2äöäöääö

 

Arrived string: 2äöäöääö

 

Dumper output:

$VAR1 = {
          'Phone' => undef,
          'MailingCountry' => undef,
          'FirstName' => '1lölölööl',
          'Login__c' => 'eetestu14',
          'LastName' => '2äöäöääö',
          'MailingStreet' => undef,
          'Email' => 'XXXXXXXXX',
          'type' => 'contact',
          'AccountId' => 'XXXXXXXX',

};

 

Perl code:

use Encode;

my $contact = {
    'type'           => 'contact',
    'AccountId'      => $accountId,
    'FirstName'      => encode_utf8($user->getFirstName()),
    'LastName'       => encode_utf8($user->getLastName()),
    'Login__c'       => $user->getLogin(),
    'Email'          => $user->getEmail(),
    'Phone'          => $user->getContactPhone(),
    'MailingStreet'  => $user->getContactAddress1(),
    'MailingCountry' => $user->getContactCountry(),
};
$res = $self->sf()->create(%$contact);

 

So, any ideas what I might be doing wrong?

 

Cheers,

Emil

Best Answer chosen by Admin (Salesforce Developers) 
cvsdude

OK, I've solved it. The solution, although a weird one, is the following:

 

use Encode;

 

my $string  = decode_utf8($user->getFirstName());

 

Yes, it is DEcode, not encode. If I don't use decode (i.e. I pass the string through unchanged), it does NOT work. I'm too busy to research why the above works (our MySQL charset IS utf8).

 

The above $string, when sent to Salesforce, arrives properly (I tested with Chinese, Japanese, Sanskrit and German).

 

Additionally, I found that I had to manually replace the ampersand (&) with (&amp;), otherwise SOAP::Lite would barf (I would have expected this to happen somewhere in that API...)
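A possible explanation for why decode works, offered as a guess since the thread never confirms it: the strings coming out of the back end may be raw UTF-8 bytes rather than Perl character strings, and the SOAP layer then encodes them a second time on the way out. Decoding first means only one encoding happens on the wire. Below is a minimal sketch of that idea using only the core Encode module. The helper name prepare_for_soap and the sample company name are made up for illustration and are not part of WWW::Salesforce or SOAP::Lite. If your SOAP::Lite version already escapes XML characters itself, drop the three substitutions.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8);

# Hypothetical helper (not part of WWW::Salesforce or SOAP::Lite):
# turn a UTF-8 byte string into a Perl character string and escape
# the XML-special characters that could otherwise break the SOAP body.
sub prepare_for_soap {
    my ($bytes) = @_;

    # Explicit undef keeps the key/value pairing intact when the
    # helper is called inside a hash constructor.
    return undef unless defined $bytes;

    # Bytes -> characters; assumes the input really is UTF-8 encoded.
    my $text = decode_utf8($bytes);

    # Escape "&" first so the entities added below are not escaped twice.
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;

    return $text;
}

# "Müller & Söhne" as raw UTF-8 bytes, e.g. as handed over by a
# database layer that does not decode its results.
my $raw  = "M\xC3\xBCller & S\xC3\xB6hne";
my $name = prepare_for_soap($raw);
```

In the contact hash this would be used as 'LastName' => prepare_for_soap($user->getLastName()), in place of the encode_utf8 call from the original question.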

All Answers

CarstenHar

Hmm, sounds like double encoding.

The question then is how the data is stored in your $user object. encode_utf8 will encode from ISO-8859-1 to UTF-8. Be aware that ISO-8859-1 is not extended ASCII. You might try a test case where you hard-code the input data. For that you should edit the data with an editor that can save files in various charsets (e.g. Eclipse).
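To make the double-encoding suspicion concrete, here is a small sketch using only the core Encode module. It shows what happens to a single "ä" when it is encoded once and then encoded again by a layer that assumes it is handling characters. The hex_bytes helper is made up for this illustration.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8);

# Hypothetical helper: show a byte string as space-separated hex pairs.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02X', ord } split //, $bytes;
}

# "ä" as a Perl character string (one character, code point U+00E4).
my $chars = "\x{E4}";

# Encoding once yields the two UTF-8 bytes C3 A4.
my $once = encode_utf8($chars);

# Encoding again treats each of those bytes as a Latin-1 character,
# so each one turns into two bytes of its own.
my $twice = encode_utf8($once);

print 'once:  ', hex_bytes($once),  "\n";   # once:  C3 A4
print 'twice: ', hex_bytes($twice), "\n";   # twice: C3 83 C2 A4
```

A receiver that decodes the second result once sees the two characters "Ã¤" where "ä" was intended, which is the typical look of double-encoded text.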

 

best wishes

 

Carsten 

cvsdude

Hi Carsten,

Thanks for the quick answer :).

 

I did try hard-coding the umlauts (with and without encode_utf8() surrounding them). It didn't work.

 

As to the charset of the $user strings, these come from a MySQL database via our back end. Our UI grabs these and displays them correctly (we explicitly set the HTTP content type to UTF-8 for our pages).

 

Hmm, I did use vim though (for the hard-coding part). I might try another editor and save the file explicitly as UTF-8.

 

Any other ideas? Do you think I should mess with SOAP::Lite itself or the WWW::Salesforce module? 

 

Cheers,

Emil 

CarstenHar

Emil,

Try the hard-coding first; vim is not the best choice here anyway :)

The Perl modules should be fine. They should work with UTF-8.

 

You might also poke around a bit in the database. The MySQL driver should allow you to switch the result data to UTF-8 as well.

 

Check for "set names 'UTF8'", "set character_set_client =", "set character_set_results =" and "set character_set_connection =" in the MySQL docs. I assume that you run the MySQL tables in latin1 (the default), so firing the above set commands might help.
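If the data is fetched through DBI with DBD::mysql, the driver-side switch could look roughly like the sketch below. Treat it as an assumption about the setup, since the thread never says how the back end talks to MySQL. The database name, host and credentials are placeholders. mysql_enable_utf8 is a DBD::mysql connection attribute that asks the driver to hand text back as decoded Perl character strings. The sketch only builds the connect arguments so it runs without a database, and the actual calls are shown as comments.

```perl
use strict;
use warnings;

# Placeholder connection settings; replace with your own.
my %db = (
    name => 'app',
    host => 'localhost',
    user => 'app_user',
    pass => 'secret',
);

# Arguments for DBI->connect. mysql_enable_utf8 is a DBD::mysql
# attribute; with it set, text results come back as character strings.
my @connect_args = (
    "DBI:mysql:database=$db{name};host=$db{host}",
    $db{user},
    $db{pass},
    { RaiseError => 1, mysql_enable_utf8 => 1 },
);

# With DBI and DBD::mysql installed, the connection would be opened with:
#   my $dbh = DBI->connect(@connect_args);
#   $dbh->do("SET NAMES 'utf8'");   # the statement suggested above
```

If the driver already returns character strings this way, no extra decode_utf8 call should be needed before the values go into the SOAP request.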

 

Carsten Harnisch

-- InTradeSys Limited 

deb_butan

 

I am getting the same issue with my code: only every other update to the campaign goes through. As you can see below, I pass the campaign name "Cell test2: âÎîÌÃé", but the response still contains the old campaign name, "Cell test1: âÎîÌÃé".

I tried to use decode_utf8, but that also failed and I am getting junk characters.

 

Please help as soon as possible.

 

 

<soap:Body>
  <AddAdsRequest xmlns="https://adcenter.microsoft.com/v6">
    <AdGroupId>51677679</AdGroupId>
    <Ads>
      <Ad xsi:type="TextAd">
        <DestinationUrl>{param1}</DestinationUrl>
        <DisplayUrl>latin1test5.rtrk.com</DisplayUrl>
        <Text>Latin1 Test5 Canoga Park, CA.</Text>
        <Title>Cell test2: âÎîÌÃé </Title>
      </Ad>
    </Ads>
  </AddAdsRequest>
</soap:Body>




<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/">
  <s:Header>
    <h:TrackingId xmlns:h="https://adcenter.microsoft.com/v6">8b4417fa-b66f-48e3-802b-18c1076f11a0</h:TrackingId>
  </s:Header>
  <s:Body>
    <GetAdsByAdGroupIdResponse xmlns="https://adcenter.microsoft.com/v6">
      <Ads xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
        <Ad i:type="TextAd">
          <EditorialStatus>Active</EditorialStatus>
          <Id>762859</Id>
          <Status>Active</Status>
          <Type>Text</Type>
          <DestinationUrl>{param1}</DestinationUrl>
          <DisplayUrl>latin1test5.rtrk.com</DisplayUrl>
          <Text>Latin1 Test5 Canoga Park, CA.</Text>
          <Title>Cell test1: âÎîÌÃé</Title>
        </Ad>
        <Ad i:type="TextAd">
          <EditorialStatus>Active</EditorialStatus>
          <Id>762860</Id>
          <Status>Active</Status>
          <Type>Text</Type>
          <DestinationUrl>{param1}</DestinationUrl>
          <DisplayUrl>latin1test5.rtrk.com</DisplayUrl>
          <Text>Latin1 Test5 Canoga Park, CA.</Text>
          <Title>{keyword}</Title>
        </Ad>
      </Ads>
    </GetAdsByAdGroupIdResponse>
  </s:Body>
</s:Envelope>

twongopenair

 

Not sure if this will work for you, but I was having an issue with UTF-8 encoded Unicode characters that not only did not arrive in Salesforce but also caused a parsing error in the SFDC API. English characters worked fine, but European language-specific characters caused problems.

 

We're using an older version of SOAP::Lite, so YMMV, but I wanted to share in case this is helpful. We call decode_utf8 and then HTML numeric (hex) encode the result. In particular:

 

use Encode;
use HTML::Entities;

my $string = 'Tesß ä';

$string = decode_utf8($string);

$string = HTML::Entities::encode_entities_numeric($string);
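For anyone who wants to see what this produces without installing HTML::Entities, here is a rough core-only stand-in. The numeric_entities function is made up for this sketch and only approximates encode_entities_numeric (for example, it also escapes tabs and newlines). The point is that the result is pure ASCII, so no later layer can mangle it.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8);

# Rough stand-in for HTML::Entities::encode_entities_numeric, core Perl only:
# replaces every character outside printable ASCII, plus the XML-special
# characters, with a hexadecimal numeric entity.
sub numeric_entities {
    my ($text) = @_;
    $text =~ s/([^\x20-\x7E]|[&<>"'])/sprintf('&#x%X;', ord $1)/ge;
    return $text;
}

# "Tesß ä" written out as UTF-8 bytes so the example does not depend
# on the encoding of the source file.
my $bytes   = "Tes\xC3\x9F \xC3\xA4";
my $escaped = numeric_entities(decode_utf8($bytes));

print "$escaped\n";   # Tes&#xDF; &#xE4;
```

Note that the decode_utf8 step still matters here: without it, each UTF-8 byte would be turned into its own entity and the receiver would reassemble the wrong characters.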