Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > C Programming > sorting the input

Reply
Thread Tools

sorting the input

 
 
Ben Bacarisse
Guest
Posts: n/a
 
      04-24-2008
arnuld <(E-Mail Removed)> writes:

>> On Wed, 23 Apr 2008 13:17:26 +0100, Ben Bacarisse wrote:

>
>>> arnuld <(E-Mail Removed)> writes:

>
>>> okay, I am done with reading and printing lines now. It is working
>>> now,

>
>> Sorry, not quite.

>
> I have shown you the output and it seems ok to me. Did you find something
> wrong ?


Yes. Not in the sense of "it won't work" but wrong in the sense of
not right! If I do this:

#define MAX_SIZE 1000
...
char line[MAX_SIZE];
...
if (fgets(line, 902, fp) != NULL) ...

is it wrong? I think so, but the code will compile and execute quite
safely. That is all I was pointing out (see below).

>>> I need to understand the sorting part now:

>
>> I think you need to show your best attempt at the sort call and people
>> will help you get the details right.

>
>
> I can't understand the 13.8 of FAQs: http://c-faq.com/lib/qsort1.html
>
> /* compare strings via pointers */
> int pstrcmp(const void *p1, const void *p2)
> {
> return strcmp(*(char * const *)p1, *(char * const *)p2);
> }
>
> parameters are made void* but in fact we are passing char**, which is 2 levels of
> indirection. and what about the cast:
>
> (char* const*)p1
>
> we are casting a <pointer to void> to <pointer to a const pointer to char>
> , right ?


Yup. qsort is given an array to sort. All it know about this data is
how big each element is and how many there are:

+----------+----------+----------+----------+----
base-----> | element1 | element2 | element3 | element4 | ...
+----------+----------+----------+----------+----

If qsort needs to compare element1 with element4, it passes a pointer
to these elements to the comparison function. These pointers must be
void * because qsort has no type information at all. Your comparison
function gets two pointers like this:

+----------+----------+----------+----------+----
base-----> | element1 | element2 | element3 | element4 | ...
+----------+----------+----------+----------+----
^ ^
| |
+---------------+ |
| |
int compare(const void *p1, const void * p2)

In your case, the elements are actually char *s -- they point to
strings like this:

+---+---+---+---+---+ +---+---+---+---+---+
| a | b | c | \n| \0| | d | e | f | \n| \0|
+---+---+---+---+---+ +---+---+---+---+---+
^ ^
| |
+---|------+----------+----------+---|------+----
base-----> | element1 | element2 | element3 | element4 | ...
+----------+----------+----------+----------+----
^ ^
| |
+---------------+ |
| |
int compare(const void *p1, const void * p2)

so although p1 is declared void * what it *really* is is a char ** --
it points to a char *. To get at the strings to compare you must
convert p1 to a char ** and access the char * it finds there (by
applying the * operator to the result). We make everything const, but
just blank that out to follow the intent:

int compare(const void *p1, const void * p2)
{
const char *const *cp1 = p1; /* char **cp1 = p1; in effect */
const char *const *cp2 = p2;
return strcmp(*cp1, *cp2);
}

*cp1 is 'element1' -- a char * pointing at "abc\n" and *cp2 is a char
pointing at "def\n". The code in the FAQ does the same (with one less
const) all in a single expression. I prefer to avoid the cast. Does
that help?

>>> while( fgets(temp_arr, max, stdin) && num_lines < max )

>> ^^^?
>>
>> Safe (because max happens to be less that STR_SIZE) but not correct.

>
> incorrect ?


Yes, you pass ARR_SIZE as the max parameter, but temp_arr is of size
STR_SIZE. Now, as it happens, max is smaller than STR_SIZE so this is
safe but fgets shoudl be passed the size of the thing is is to use,
not some other size that relates to something else!

>>> if( (p = malloc( size_arr * sizeof( char ))) )

>
>> You can omit sizeof(char). It is 1 by definition.

>
> you mean, by default, if I type: <malloc( size_arr )> then it will be
> converted to <malloc(size_arr * 1)> ?


Not really. If you write x it is not converted to x * 1 even though
they are the same. If you write malloc(size_arr) you get size_arr
bytes. Multiplying by sizeof(char) is pointless because sizeof(char)
is 1. Some people like to write it out because it reminds them that
are allocating space for chars, but I prefer not to do that.

>> I can see why some
>> people might want a size there, but if you are one of them then you
>> should use the c.l.c-approved idiom:
>>
>> if( (p = malloc(size_arr * sizeof *p)) )

>
> okay


--
Ben.
 
Reply With Quote
 
 
 
 
Ben Bacarisse
Guest
Posts: n/a
 
      04-24-2008
arnuld <(E-Mail Removed)> writes:

> I have tried the strcmp function from FAQ:
>
> http://c-faq.com/lib/qsort1.html
>
> it fails to do its job. It Segfaults . If I remove its call from the
> program, my program compiles and runs fine:


How many elements are you sorting?

--
Ben.
 
Reply With Quote
 
 
 
 
Ben Bacarisse
Guest
Posts: n/a
 
      04-24-2008
pete <(E-Mail Removed)> writes:

> arnuld wrote:
>
>> qsort( arr_of_ptr, ARR_SIZE, sizeof( char* ), p_strcmp );

>
>> /* compare 2 strings using pointers */
>> int p_strcmp( const void* pv1, const void* pv2 )
>> {
>> return strcmp( *(char* const*)pv1, *(char* const*)pv2 );
>> }

>
>
> int p_strcmp(const void *pv1, const void *pv2)
> {
> return strcmp(pv1, pv2);
> }


Eh? The array being sorted is an array of char *s.

--
Ben.
 
Reply With Quote
 
Ben Bacarisse
Guest
Posts: n/a
 
      04-24-2008
Richard Heathfield <(E-Mail Removed)> writes:

> Richard Tobin said:
>
> <snip>
>
>> In effect, [sizeof] instructs the compiler to find the number of bytes
>> used by the argument. That depends on the *type* of the argument,
>> not its value. And you don't need to evaluate an expression to
>> determine its type: if a and b are ints, a+b is an int regardless
>> of the values of a and b; if x is an array of doubles, x[5] is always
>> a double, and so on.
>>
>> The compiler does some calculations to determine the type and hence
>> the size, so you could say it evaluates *something*, but it doesn't
>> evaluate the argument itself.

>
> s/argument/operand/g
>
> Sorry to be picky, but too few people realise that sizeof is an operator
> rather than a function, so I think it's better to be precise in this
> situation. (Yes, I know that you know. Just a typo, understood.)


That may be being a bit over-picky. The distinction is such a fine
one that the standard gets it wrong. Although the definition of
"argument" states that these are what get passed in function calls,
the text later refers to operators taking arguments.

--
Ben.
 
Reply With Quote
 
arnuld
Guest
Posts: n/a
 
      04-24-2008
> On Wed, 23 Apr 2008 13:17:26 +0100, Ben Bacarisse wrote:

>> arnuld wrote:
>> size_arr = strlen( temp_arr ) + 1;
>> if( (p = malloc( size_arr * sizeof( char ))) )


> You can omit sizeof(char). It is 1 by definition. I can see why some
> people might want a size there, but if you are one of them then you
> should use the c.l.c-approved idiom:
>
> if( (p = malloc(size_arr * sizeof *p)) )


where is "p" pointing to ? nowhere . so you can't dereference it yet.


--
http://lispmachine.wordpress.com/
my email ID is at the above address

 
Reply With Quote
 
arnuld
Guest
Posts: n/a
 
      04-24-2008
I have tried the strcmp function from FAQ:

http://c-faq.com/lib/qsort1.html

it fails to do its job. It Segfaults . If I remove its call from the
program, my program compiles and runs fine:


/* write a program to read a set of lines from input and sort them
* and then print them.
*
* version 1.0
*/


#include <stdio.h>
#include <stdlib.h>
#include <string.h>

enum MAXLINES { ARR_SIZE = 100, STR_SIZE = 1000 };

char* arr_of_ptr[ARR_SIZE];
/* char arr_of_char[STR_SIZE]; */

int readlines( char**, const int );
void printlines( char** );
int p_strcmp( const void*, const void* );


/* main() will simply call the other functions to do the job */
int main( void )
{
if( readlines( arr_of_ptr, ARR_SIZE ) > 0 )
{
qsort( arr_of_ptr, ARR_SIZE, sizeof( char* ), p_strcmp );
printlines( arr_of_ptr );
}


return 0;
}


/* 1) read lines till we get the NULL,
* 2) store those lines into an array of characters <arr_of_lines>,
* 3) pointer of arry of pointers <arr_of_ptr> will point to the
* individual elements of array of characters <arr_of_lines>,
*
*/

int readlines( char* arr_of_ptr[], const int max )
{
char *p, **p_arrptr;
int num_lines, size_arr;
char temp_arr[STR_SIZE];

num_lines = 0;
p_arrptr = arr_of_ptr;

while( fgets(temp_arr, max, stdin) && num_lines < max )
{
size_arr = strlen( temp_arr ) + 1;
if( (p = malloc( size_arr * sizeof( char ))) )
{
strcpy( p, temp_arr );
*p_arrptr++ = p;
++num_lines;
}

}

return num_lines;
}



/* it will simply print the lines pointed to by the elements of
* arrays of pointers <arr_of_ptr>.
*
*/
void printlines( char* arr_of_ptr[] )
{
printf("\n-------------------------\n");
while( *arr_of_ptr )
{
printf("%s", *arr_of_ptr++ );
}
}


/* compare 2 strings using pointers */
int p_strcmp( const void* pv1, const void* pv2 )
{
return strcmp( *(char* const*)pv1, *(char* const*)pv2 );
}

 
Reply With Quote
 
arnuld
Guest
Posts: n/a
 
      04-24-2008
> On Thu, 24 Apr 2008 10:20:07 +0000, Richard Heathfield wrote:


> He isn't dereferencing it: sizeof does not evaluate its operand (except
> for one tiny corner-case in C99 that doesn't apply here), so no
> dereferencing is taking place.


so, sizeof() operator does not evaluate its operands. It finds the number
of bytes without evaluating anything ?



--
http://lispmachine.wordpress.com/
my email ID is at the above address

 
Reply With Quote
 
arnuld
Guest
Posts: n/a
 
      04-25-2008
> On Thu, 24 Apr 2008 04:33:14 -0700, Barry Schwarz wrote:

>> On Fri, 25 Apr 2008 02:04:45 +0500, arnuld <(E-Mail Removed)> wrote:
>> if( readlines( arr_of_ptr, ARR_SIZE ) > 0 )
>> {
>> qsort( arr_of_ptr, ARR_SIZE, sizeof( char* ), p_strcmp );


> The second argument to qsort is the number of array elements to
> process. While there are ARR_SIZE elements, most of them contain NULL
> and should not be processed. readlines told you how many elements to
> process but you threw that information away.



Ohh.. No.. Why on Earth I forgot to use readlines


int main( void )
{
const int read_num = readlines( arr_of_ptr, ARR_SIZE );

if( read_num )
{
qsort( arr_of_ptr, read_num, sizeof( char* ), p_strcmp );
printlines( arr_of_ptr );
}




>> while( fgets(temp_arr, max, stdin) && num_lines < max )


> People have already told you that max is not appropriate in the call
> to fgets. If you won't listen, why post?


That was my typo, I think I need a break.




>> size_arr = strlen( temp_arr ) + 1;


> It is a nit but this will include the '\n' that fgets inserts for a
> line shorter than the buffer. Most seem to think it is worth
> eliminating.


fgets() stops reading when it will encounter a '\n' and this is what
reading a line means. So it will retain every newline and in the end will
put '\0'. I think its fine.

I did not get what you mean by "inserts for a line shorter than the
buffer". The last 2 elements of the malloc-ed array will always be '\n'
and '\0', not matter whether the line is small or large, as long it is of
STR_SIZE, I have defined for the limit.




>> int p_strcmp( const void* pv1, const void* pv2 )
>> {
>> return strcmp( *(char* const*)pv1, *(char* const*)pv2 );


> When either v1 or v2 is NULL, this invoked undefined behavior. A
> segfault is one of the nicer manifestations of same.



I just needed to replace the "max" with "STR_SIZE" and it works fine now


/* write a program to read a set of lines from input and sort them
* and then print them.
*
* version 2.0
*/


#include <stdio.h>
#include <stdlib.h>
#include <string.h>

enum MAXLINES { ARR_SIZE = 100, STR_SIZE = 1000 };

char* arr_of_ptr[ARR_SIZE];
/* char arr_of_char[STR_SIZE]; */

int readlines( char**, const int );
void printlines( char** );
int p_strcmp( const void*, const void* );


/* main() will simply call the other functions to do the job */
int main( void )
{
const int read_num = readlines( arr_of_ptr, STR_SIZE );

if( read_num )
{
qsort( arr_of_ptr, read_num, sizeof( char* ), p_strcmp );
printlines( arr_of_ptr );
}


return 0;
}


/* 1) read lines till we get the NULL,
* 2) store those lines into an array of characters <arr_of_lines>,
* 3) pointer of arry of pointers <arr_of_ptr> will point to the
* individual elements of array of characters <arr_of_lines>,
*
*/

int readlines( char* arr_of_ptr[], const int max )
{
char *p, **p_arrptr;
int num_lines, size_arr;
char temp_arr[STR_SIZE];

num_lines = 0;
p_arrptr = arr_of_ptr;

while( fgets(temp_arr, max, stdin) && num_lines < max )
{
size_arr = strlen( temp_arr ) + 1;
if( (p = malloc( size_arr * sizeof( char ))) )
{
strcpy( p, temp_arr );
*p_arrptr++ = p;
++num_lines;
}

}

return num_lines;
}



/* it will simply print the lines pointed to by the elements of
* arrays of pointers <arr_of_ptr>.
*
*/
void printlines( char* arr_of_ptr[] )
{
printf("\n-------------------------\n");
while( *arr_of_ptr )
{
printf("%s", *arr_of_ptr++ );
}
}


/* compare 2 strings using pointers */
int p_strcmp( const void* pv1, const void* pv2 )
{
return strcmp( *(char* const*)pv1, *(char* const*)pv2 );
}

========== OUTPUT =================
/home/arnuld/programs/C $ gcc -ansi -pedantic -Wall -Wextra 5-7.c
/home/arnuld/programs/C $ ./a.out
like
zzz
abc

-------------------------
abc
like
zzz
/home/arnuld/programs/C $



--
http://lispmachine.wordpress.com/
my email ID is at the above address

 
Reply With Quote
 
arnuld
Guest
Posts: n/a
 
      04-25-2008
> On Thu, 24 Apr 2008 13:19:53 +0100, Ben Bacarisse wrote:


>> arnuld wrote:
>> we are casting a <pointer to void> to <pointer to a const pointer to
>> char> , right ?

>
> Yup. qsort is given an array to sort. All it know about this data is
> how big each element is and how many there are:



so, <char** cp> can be stored in <void* vp>. I thought a ** (pointer to
pointer) needs a ** type to store but here we are storing a ** into a
single *. are they of same size ?




> +----------+----------+----------+----------+----
> base-----> | element1 | element2 | element3 | element4 | ...
> +----------+----------+----------+----------+----


> ....[SNIP]...


> int compare(const void *p1, const void * p2) {
> const char *const *cp1 = p1; /* char **cp1 = p1; in effect */
> const char *const *cp2 = p2;
> return strcmp(*cp1, *cp2);
> }
> }
> *cp1 is 'element1' -- a char * pointing at "abc\n" and *cp2 is a char
> pointing at "def\n". The code in the FAQ does the same (with one less
> const) all in a single expression. I prefer to avoid the cast. Does
> that help?



helps in knowing that you think like me on this aspect. I also do not like
casts

thanks for all the effort you put in





--
http://lispmachine.wordpress.com/
my email ID is at the above address

 
Reply With Quote
 
arnuld
Guest
Posts: n/a
 
      04-25-2008
> On Thu, 24 Apr 2008 12:10:27 +0000, Richard Tobin wrote:


> In effect, it instructs the compiler to find the number of bytes
> used by the argument. That depends on the *type* of the argument,
> not its value. And you don't need to evaluate an expression to
> determine its type: if a and b are ints, a+b is an int regardless
> of the values of a and b; if x is an array of doubles, x[5] is always
> a double, and so on.
>
> The compiler does some calculations to determine the type and hence
> the size, so you could say it evaluates *something*, but it doesn't
> evaluate the argument itself.



I never expected this to be like this way. I am quite surprised at this
C feature. Are there any other C operators who do not evaluate their
operands ?




--
http://lispmachine.wordpress.com/
my email ID is at the above address

 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Sorting list vs sorting vector boltar2003@boltar.world C++ 2 07-06-2010 09:40 AM
Sorting the Input arnuld C Programming 1 09-30-2008 07:47 AM
sorting the input arnuld C++ 16 09-25-2008 04:44 PM
fired event Sorting which wasn't handled - sorting and SelectedIndexChanged Jason ASP .Net Web Controls 0 10-04-2006 02:19 PM
sorting by multiple criterias (sub-sorting) Tom Kirchner Perl Misc 3 10-11-2003 05:16 PM



Advertisments