Shortest way to normalize strings - replace accents and non-ASCII chars with 'normal' chars
Hi,
Banks are really old-fashioned in the way they handle text data in files, even when those files go through IT systems built on the latest standards. In SEPA, for example, they still handle text in their own charset, which is not Unicode and is closer to old ASCII or even EBCDIC :(
In the US, it's not a big problem, but in European and Asian countries, this is mega-important.
In Java, the Normalizer and Pattern classes are great for this: Unicode text can be normalized down to ASCII in 2 or 3 instructions. In Apex, the same job takes a *lot* more code.
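For reference, the Java idiom I have in mind looks roughly like the sketch below. The class and method names (AsciiFold, fold) are just illustrative, and this is a minimal version rather than a full bank-charset filter: it decomposes each accented letter into base letter plus combining mark, then strips the marks.

```java
import java.text.Normalizer;

// Illustrative helper, not from any library: the names are made up for this post.
class AsciiFold {
    static String fold(String s) {
        // NFD splits an accented letter into its base letter followed by combining marks.
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        // \p{M} matches the combining marks left behind by the decomposition.
        return decomposed.replaceAll("\\p{M}+", "");
    }
}
```

Note that ligatures such as Æ and Œ have no decomposition, so they pass through unchanged and still need their own handling.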
This is the normalization I go through today:

    private String clean(String in) {
        //String minmaj = "ÀÂÄÇÉÈÊËÎÏÛÜÔÖaàâäbcçdeéèêëfghiîïjklmnoôöpqrstuùûüvwxyz";
        //String maj = "AAACEEEEIIUUOOAAAABCCDEEEEEFGHIIIJKLMNOOOPQRSTUUUUVWXYZ";

        // Characters to replace: accented capitals, then punctuation, then the apostrophe
        // (plus Œ, Æ and &, which are handled separately below).
        String acc = 'ÀÂÄÇÉÈÊËÎÏÌÛÜÙÔÖÒÑ' + '°()§<>%^¨*$€£`#,;./?!+=_@"' + '\'';
        // Replacements, position for position: one space per punctuation character,
        // so maj must be exactly as long as acc.
        String maj = 'AAACEEEEIIIUUUOOON' + '                          ' + ' ';
        String out = '';
        for (Integer i = 0; i < in.length(); i++) {
            String car = in.substring(i, i + 1);
            Integer idx = acc.indexOf(car);
            if (idx != -1) {
                out += maj.substring(idx, idx + 1);
            } else if (car == 'Œ') {
                out += 'OE';
            } else if (car == '&') {
                out += 'ET';
            } else if (car == 'Æ') {
                out += 'AE';
            } else {
                out += car;
            }
        }
        return out;
    }
Remember, this is to produce files where alignment is important, so I can't just replace Æ with AE without fixing the padding instructions (elsewhere).
This method uses too many instructions: has anyone got a better way of doing it, still in Apex?
TIA,
Rup
Have you explored the Matcher and Pattern (regular expression) methods in Apex? If not, I would recommend exploring them to reduce the number of instructions.
http://www.salesforce.com/us/developer/docs/apexcode/index_Left.htm#CSHID=apex_classes_pattern_and_matcher_pattern_methods.htm|StartTopic=Content%2Fapex_classes_pattern_and_matcher_pattern_methods.htm|SkinName=webhelp
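As a rough idea of what the regex route could look like, here is a sketch in Java, since Apex's Pattern and Matcher are modeled on java.util.regex. It is an assumption on my part that this would cut your statement count: the class name, the rule table, and the choice of characters are all illustrative and cover only a few letters. The idea is one replaceAll per target letter instead of one branch per input character, with a final catch-all that turns anything outside printable ASCII into a single space.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: a small rule table, applied in insertion order.
class RegexFold {
    static final Map<String, String> RULES = new LinkedHashMap<>();
    static {
        RULES.put("[\u00C0\u00C2\u00C4]", "A");       // À Â Ä
        RULES.put("\u00C7", "C");                     // Ç
        RULES.put("[\u00C9\u00C8\u00CA\u00CB]", "E"); // É È Ê Ë
        // Catch-all: whatever is still outside printable ASCII becomes one space.
        RULES.put("[^\\x20-\\x7E]", " ");
    }

    static String fold(String s) {
        for (Map.Entry<String, String> rule : RULES.entrySet()) {
            s = s.replaceAll(rule.getKey(), rule.getValue());
        }
        return s;
    }
}
```

Every rule here swaps one character for one character, so the length of the string does not change for the characters listed.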
I know Pattern and Matcher well.
What I really need is the equivalent of Java's Normalizer to go with them: does anyone have an idea?
Rup
Thanks. Would love to see more formal support for string normalization from Salesforce. Can't the Java implementation be lifted?