Sorting strings with characters and numbers

 
 
Carsten Zerbst
 
      08-13-2003
Hello,

I need to sort some strings in the order produced by Tcl's lsort
command with the -dictionary option:

% lsort -dictionary {a1 a2 a3 a4 a10 a20 a30}
a1 a2 a3 a4 a10 a20 a30


In Java I get something like this:

bsh % print(l);
[a1, a2, a3, a10, a20, a30]
bsh % Collections.sort(l);
bsh % print(l);
[a1, a10, a2, a20, a3, a30]
bsh %

I looked at the RuleBasedCollator but found no way to achieve this
sort order. As this is a standard problem, there must be a solution
available somewhere?

Thanks, Carsten

--
Dipl. Ing. Carsten Zerbst | (E-Mail Removed)
 
 
Marko Lahma
 
      08-13-2003
> bsh % print(l);
> [a1, a2, a3, a10, a20, a30]
> bsh % Collections.sort(l);
> bsh % print(l);
> [a1, a10, a2, a20, a3, a30]
> bsh %
>


The brute-force way would be to write a java.util.Comparator for String
objects that sorts according to your custom needs (RuleBasedCollator
implements that interface). The example you gave would be easy if all
words simply end with a numerical value.

I don't think RuleBasedCollator would be the right solution anyway. Maybe
you could even port Tcl's lsort to Java and share it!
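
For the easy case mentioned above (every word is some text followed by a number), such a Comparator could look roughly like this. It is only a sketch: the class name TrailingNumberComparator and its helper suffixStart are made up for illustration, and it does not try to reproduce everything Tcl's lsort -dictionary does.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not from this thread: compares strings of the form
// <text><digits>, e.g. "a10", by the text first and then by the number.
class TrailingNumberComparator implements Comparator<String> {

    // Index at which the trailing run of ASCII digits starts
    // (s.length() if the string does not end with a digit).
    static int suffixStart(String s) {
        int i = s.length();
        while (i > 0 && s.charAt(i - 1) >= '0' && s.charAt(i - 1) <= '9') {
            i--;
        }
        return i;
    }

    @Override
    public int compare(String a, String b) {
        int ia = suffixStart(a);
        int ib = suffixStart(b);

        // First criterion: the text in front of the number.
        int byText = a.substring(0, ia).compareTo(b.substring(0, ib));
        if (byText != 0) {
            return byText;
        }

        String na = a.substring(ia);
        String nb = b.substring(ib);

        // Assumption: a word without a number sorts before one with a number.
        if (na.isEmpty() || nb.isEmpty()) {
            return na.length() - nb.length();
        }

        // Second criterion: the numeric value; BigInteger avoids overflow.
        int byNumber = new BigInteger(na).compareTo(new BigInteger(nb));

        // Tie-break so that "a01" and "a1" are not reported as equal.
        return byNumber != 0 ? byNumber : a.compareTo(b);
    }

    public static void main(String[] args) {
        List<String> l = new ArrayList<>(
                Arrays.asList("a1", "a10", "a2", "a20", "a3", "a30"));
        Collections.sort(l, new TrailingNumberComparator());
        System.out.println(l); // [a1, a2, a3, a10, a20, a30]
    }
}
```

Strings with digits in the middle (say "a10b2") are not split any further here, so this only covers the pattern from the original example.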

-Marko

 
 
Roedy Green
 
      08-13-2003
On Wed, 13 Aug 2003 10:36:11 +0200, Carsten Zerbst
<(E-Mail Removed)> wrote or quoted :

>I'd need to sort some strings using the order as given by Tcls lsort
>command with -dictionary option:


I have no idea what a Tcl lsort is, but given your European name, I
will guess your problem is you need to sort alphabetically putting the
accented letters in a different place than Unicode would naturally
place them.


see http://mindprod.com/jgloss/sort.html

particularly the reference to java.text.Collator and
java.text.CollationKey
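
A minimal sketch of how those two classes fit together, assuming German text and the made-up class name CollatorDemo. Note that this only addresses the placement of accented letters; it does not make "a2" sort before "a10".

```java
import java.text.CollationKey;
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical demo class, not from this thread.
class CollatorDemo {

    // Sorts words with the collation rules of the German locale.
    static List<String> sortGerman(List<String> words) {
        Collator collator = Collator.getInstance(Locale.GERMAN);

        // A CollationKey holds precomputed comparison data for one string,
        // so the collation rules are applied once per string, not per compare.
        List<CollationKey> keys = new ArrayList<>(words.size());
        for (String w : words) {
            keys.add(collator.getCollationKey(w));
        }
        Collections.sort(keys); // CollationKey is Comparable

        List<String> sorted = new ArrayList<>(keys.size());
        for (CollationKey k : keys) {
            sorted.add(k.getSourceString());
        }
        return sorted;
    }

    public static void main(String[] args) {
        // \u00c4 is a capital A with umlaut
        List<String> words = Arrays.asList("Zebra", "\u00c4pfel", "Apfel");
        // With plain Collections.sort the umlaut word would land after "Zebra";
        // with the collator "Zebra" comes last.
        System.out.println(sortGerman(words));
    }
}
```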
--
Canadian Mind Products, Roedy Green.
Coaching, problem solving, economical contract programming.
See http://mindprod.com/jgloss/jgloss.html for The Java Glossary.
 
 
Roedy Green
 
      08-13-2003
On Wed, 13 Aug 2003 10:36:11 +0200, Carsten Zerbst
<(E-Mail Removed)> wrote or quoted :

>% lsort -dictionary {a1 a2 a3 a4 a10 a20 a30}
>a1 a2 a3 a4 a10 a20 a30


Perhaps what you really want to do is split each field in two, and
sort alphabetically on the alpha part and numerically on the numeric
part. It would be fastest to do this split before the sort starts
rather than on every compare.
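
Roughly, the split-before-sort idea could be sketched like this. SplitKey is a made-up name, and the sketch assumes each field is some text followed by an optional run of digits, as in the example.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical key class, not from this thread: each string is split once,
// so the sort itself only compares ready-made fields.
final class SplitKey implements Comparable<SplitKey> {
    final String original;
    final String alpha;      // part in front of the trailing digits
    final BigInteger number; // trailing digits as a number, null if none

    SplitKey(String s) {
        int i = s.length();
        while (i > 0 && Character.isDigit(s.charAt(i - 1))) {
            i--;
        }
        original = s;
        alpha = s.substring(0, i);
        number = (i < s.length()) ? new BigInteger(s.substring(i)) : null;
    }

    @Override
    public int compareTo(SplitKey other) {
        // alphabetically on the alpha part ...
        int c = alpha.compareTo(other.alpha);
        if (c != 0) {
            return c;
        }
        // ... assumption: a missing number sorts before any number ...
        if (number == null || other.number == null) {
            return (number == null ? 0 : 1) - (other.number == null ? 0 : 1);
        }
        // ... and numerically on the numeric part.
        c = number.compareTo(other.number);
        return (c != 0) ? c : original.compareTo(other.original);
    }

    // Build the keys once, sort them, then read the strings back out.
    static List<String> sort(List<String> input) {
        List<SplitKey> keys = new ArrayList<>(input.size());
        for (String s : input) {
            keys.add(new SplitKey(s));
        }
        Collections.sort(keys);
        List<String> out = new ArrayList<>(keys.size());
        for (SplitKey k : keys) {
            out.add(k.original);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(sort(Arrays.asList("a1", "a10", "a2", "a20", "a3", "a30")));
        // [a1, a2, a3, a10, a20, a30]
    }
}
```

The cost is one extra object per string; the gain is that the parsing happens n times instead of on each of the roughly n log n compares.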

 
 
Carsten Zerbst
 
      08-14-2003
Hello,

for the record, this is the Collator implementation I wrote.

Bye, Carsten
=================


public int compare( String source, String target ) {
    // ß is misplaced in most code pages; replace it by ss
    // for the comparison
    source = source.replaceAll( "\u00df", "ss" );
    target = target.replaceAll( "\u00df", "ss" );

    // ä equals ae
    source = source.replaceAll( "\u00e4", "ae" );
    source = source.replaceAll( "\u00c4", "Ae" );
    target = target.replaceAll( "\u00e4", "ae" );
    target = target.replaceAll( "\u00c4", "Ae" );

    // ö equals oe
    source = source.replaceAll( "\u00f6", "oe" );
    source = source.replaceAll( "\u00d6", "Oe" );
    target = target.replaceAll( "\u00f6", "oe" );
    target = target.replaceAll( "\u00d6", "Oe" );

    // ü equals ue
    source = source.replaceAll( "\u00fc", "ue" );
    source = source.replaceAll( "\u00dc", "Ue" );
    target = target.replaceAll( "\u00fc", "ue" );
    target = target.replaceAll( "\u00dc", "Ue" );

    if ( source.equals( target ) ) {
        return 0;
    }

    // compare char by char until the first difference occurs
    int ls = source.length( );
    int lt = target.length( );

    int index = 0;
    while ( index < ls && index < lt && source.charAt( index ) == target.charAt( index ) ) {
        index++;
    }

    // reached end of one string ? then the shorter one comes first
    if ( index == ls ) {
        return -10 - index;
    }

    if ( index == lt ) {
        return 10 + index;
    }

    // look at the remaining difference
    char sDiffChar = source.charAt( index );
    char tDiffChar = target.charAt( index );
    boolean sDigit = Character.isDigit( sDiffChar );
    boolean tDigit = Character.isDigit( tDiffChar );

    // both are digits, compare the complete digit runs as numbers
    if ( sDigit && tDigit ) {
        // the strings agree before index, so both runs start at the same place
        int start = index;
        while ( start > 0 && Character.isDigit( source.charAt( start - 1 ) ) ) {
            start--;
        }

        int sEnd = index;
        while ( sEnd < ls && Character.isDigit( source.charAt( sEnd ) ) ) {
            sEnd++;
        }

        int tEnd = index;
        while ( tEnd < lt && Character.isDigit( target.charAt( tEnd ) ) ) {
            tEnd++;
        }

        // BigInteger instead of Integer.parseInt, which fails on long digit runs
        java.math.BigInteger snumber = new java.math.BigInteger( source.substring( start, sEnd ) );
        java.math.BigInteger tnumber = new java.math.BigInteger( target.substring( start, tEnd ) );

        int cmp = snumber.compareTo( tnumber );
        if ( cmp != 0 ) {
            return ( cmp < 0 ) ? ( -10000 ) : 10000;
        }

        // same value (e.g. 01 and 1), fall back to the differing chars
        return ( sDiffChar < tDiffChar ) ? ( -10000 ) : 10000;
    }

    // both are letters, compare using unicode
    if ( Character.isLetter( sDiffChar ) && Character.isLetter( tDiffChar ) ) {
        return ( sDiffChar < tDiffChar ) ? ( -100 ) : 100;
    }

    // one is digit, one is letter, digit first
    if ( Character.isLetterOrDigit( sDiffChar ) && Character.isLetterOrDigit( tDiffChar ) ) {
        return sDigit ? ( -1000 ) : 1000;
    }

    // anything else (punctuation, blanks), compare using unicode
    return ( sDiffChar < tDiffChar ) ? ( -10000 ) : 10000;
}


--
Dipl. Ing. Carsten Zerbst | (E-Mail Removed)
 
 
Roedy Green
 
      08-14-2003
On Thu, 14 Aug 2003 14:03:22 +0200, Carsten Zerbst
<(E-Mail Removed)> wrote or quoted :

> // compare char by char until the first difference occures
> int ls = source.length( );
> int lt = target.length( );


String.compareTo does this for you.

 
 
Roedy Green
 
      08-14-2003
On Thu, 14 Aug 2003 20:35:02 GMT, Roedy Green <(E-Mail Removed)>
wrote or quoted :

>
>> // compare char by char until the first difference occures
>> int ls = source.length( );
>> int lt = target.length( );

>
> String.compareTo does this for you.


Retraction: you are doing something more complicated.

 