# Why does this work? (rot13 function)

Eirik
Guest
Posts: n/a

 12-16-2003
This is a little function I wrote, inspired by the thread
"Urgent HELP! required for Caesar Cipher PLEASE"

$ cat /home/keisar/bin/c/ymse/rot13.h

char rot13(char character)
{
int changed;
changed = character - 'a' + 'n';
return changed;
}

1) I don't have to specify that b should be replaced by n,
c by o and so on. How come?
2) The function returns a char(char rot13), but changed
is an integer. How is that possible?

Joona I Palaste
Guest
Posts: n/a

 12-16-2003
Eirik <(E-Mail Removed)> scribbled the following:
> This is a little function I wrote, inspired by the thread
> "Urgent HELP! required for Caesar Cipher PLEASE"

> $ cat /home/keisar/bin/c/ymse/rot13.h

> char rot13(char character)
> {
> int changed;
> changed = character - 'a' + 'n';
> return changed;
> }

> 1) I don't have to specify that b should be replaced by n,
> c by o and so on. How come?

Actually, if we're strict about the C standard, you DO have to
specify that b should be replaced by n and so on. You happen to be
using a character set where 'a'...'z' are contiguous, but the C
standard allows for other character sets.
Provided 'a'...'z' are contiguous, the above code works, because
chars are just integer types in C.

> 2) The function returns a char(char rot13), but changed
> is an integer. How is that possible?

Because chars are integer types. Unlike Pascal, C does not require
conversion functions between characters and their numeric values,
but treats them as interchangeable by themselves.
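
A minimal sketch of what "chars are integer types" means for the function in question. The names `shift_by_13` and `check_shift_by_13` are made up for illustration, and the asserted values assume an ASCII-like character set in which 'a'...'z' are contiguous.

```c
#include <assert.h>

/* Hypothetical helper: same arithmetic as the original rot13(). */
char shift_by_13(char character)
{
    /* 'a' and 'n' are int constants in C, so the whole expression is int. */
    int changed = character - 'a' + 'n';

    /* Returning an int from a function declared char converts implicitly. */
    return changed;
}

/* Self-check; the values hold for ASCII, not for every character set. */
void check_shift_by_13(void)
{
    assert('n' - 'a' == 13);
    assert(shift_by_13('a') == 'n');
    assert(shift_by_13('b') == 'o');
    assert(shift_by_13('m') == 'z');
}
```

No conversion function is involved anywhere: the subtraction, the addition and the return are all plain integer operations.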

--
/-- Joona Palaste ((E-Mail Removed)) ------------- Finland --------\
\-- http://www.helsinki.fi/~palaste --------------------- rules! --------/
"Show me a good mouser and I'll show you a cat with bad breath."
- Garfield

Hallvard B Furuseth
Guest
Posts: n/a

 12-16-2003
Eirik wrote:

> This is a little function I wrote, inspired by the thread
> "Urgent HELP! required for Caesar Cipher PLEASE"
>
> $ cat /home/keisar/bin/c/ymse/rot13.h
>
> char rot13(char character)
> {
> int changed;
> changed = character - 'a' + 'n';
> return changed;
> }

See Joona's reply. Except, it only works for characters a-m and A-M.
It gives the wrong result for n-z and N-Z. Try this:

#include <ctype.h>

char rot13(char character)
{
int changed;
if (tolower((unsigned char)character) >= 'n')
changed = character - 'n' + 'a';
else
changed = character + 'n' - 'a';
return changed;
}

Still only works for some character sets, of course.

The argument to tolower() is cast to unsigned char because a 'char'
can be negative, while tolower() & co expect characters to be in
the range of 'unsigned char'.
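
One thing the version above does not guard against is non-letters: a digit such as '5' compares below 'n' and gets shifted too. A possible variation, sketched under the same assumption of contiguous letters and the default "C" locale, tests with isalpha() first. The name `rot13_letters_only` is hypothetical.

```c
#include <ctype.h>

/* Sketch: rotate letters only, leave everything else untouched.
   Still assumes contiguous 'a'..'z' and 'A'..'Z' (true for ASCII). */
char rot13_letters_only(char character)
{
    /* <ctype.h> functions want a value in the range of unsigned char. */
    unsigned char uc = (unsigned char)character;

    if (!isalpha(uc))
        return character;                       /* digits, punctuation, ... */
    if (tolower(uc) >= 'n')
        return (char)(character - ('n' - 'a')); /* second half: go back */
    return (char)(character + ('n' - 'a'));     /* first half: go forward */
}
```

The cast is done once and the result reused, so both isalpha() and tolower() receive a non-negative argument even where plain char is signed.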

--
Hallvard

Arthur J. O'Dwyer
Guest
Posts: n/a

 12-16-2003

On Tue, 16 Dec 2003, Eirik wrote:
>
> This is a little function I wrote, inspired by the thread
> "Urgent HELP! required for Caesar Cipher PLEASE"
>
> $ cat /home/keisar/bin/c/ymse/rot13.h

Something nobody else has pointed out yet: Executable code
like the function below should *not* be in a header file (ending
with ".h", as you have above). It should be in a separate
translation unit, in a source file ending with ".c", and you
should learn how to use your compiler to compile projects
consisting of multiple ".c" source files.
<OT> Using gcc, it's easy:
% gcc -W -Wall -ansi -pedantic -O2 mainfile.c rot13.c
(and any other source files in the project). All the options
are only there to catch mistakes in your code; if you write
perfect code, you don't need them. [In other words, you
*do* need them. Always.]
</OT>

> char rot13(char character)
> {
> int changed;
> changed = character - 'a' + 'n';
> return changed;
> }
>
> 1) I don't have to specify that b should be replaced by n,
> c by o and so on. How come?

Because your system, like most systems in the world today,
uses ASCII to represent characters. Part of the ASCII character
table looks like this:

"...]^_`abcdefghijklmnopqrstuvwxyz{|}..."

See how all the lowercase letters are packed together, in order?
That's why your code works (do the math yourself as to why it
converts 'b' to 'o').
As Joona notes, all your code is doing is adding 'n'-'a', or 13
(assuming ASCII), to the character it receives. Which is why it
fails miserably to convert 'o' to 'b', or 'z' to 'a'.
Your code won't work on some other real-life systems out there,
and it might even cause demons to fly out of your nose on the
Death Station 9000. (Google for it.) So it's not really the
best way to do it if you're writing code that's supposed to work
everywhere.
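
The arithmetic can be checked with a short sketch. It assumes ASCII, handles lowercase letters only, and the function names are invented for this example; the wrapped variant is one possible fix, not the only one.

```c
#include <assert.h>

/* The original arithmetic, kept as int so nothing is lost on return. */
int add_13(int c)
{
    return c - 'a' + 'n';
}

/* Hypothetical fix for lowercase letters only: wrap around after 'z'.
   Assumes 'a'..'z' are contiguous, as in ASCII. */
int add_13_wrapped(int c)
{
    return 'a' + (c - 'a' + 13) % 26;
}

void check_add_13(void)
{
    assert(add_13('b') == 'o');          /* first half of the alphabet: fine */
    assert(add_13('o') > 'z');           /* second half: runs past 'z' */
    assert(add_13_wrapped('o') == 'b');  /* the modulo brings it back */
    assert(add_13_wrapped('z') == 'm');
}
```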

> 2) The function returns a char(char rot13), but changed
> is an integer. How is that possible?

You wrote 'int' instead of 'char', that's how. In C,
characters are treated just like little integers, so you can
do arithmetic on them, as you've already figured out. And of
course you can assign 'char' to 'int' and vice versa, because
that just involves narrowing or widening the integers involved.

Here's how you could write that code more portably, so it
wouldn't depend on the organization of the letters in your
character set:

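#include <ctype.h>
#include <limits.h>

int rot13(int c)
{
    /* one entry for every possible unsigned char value */
    static unsigned char lookup[UCHAR_MAX + 1];
    static const char Alpha[] = "abcdefghijklmnopqrstuvwxyz";
    static int not_initialized_yet = 1;

    if (not_initialized_yet) {
        unsigned int i;
        for (i = 0; i < sizeof lookup; ++i) {
            lookup[i] = (unsigned char)i;
        }
        /* sizeof Alpha - 1: skip the terminating '\0' */
        for (i = 0; i < sizeof Alpha - 1; ++i) {
            unsigned char from = (unsigned char)Alpha[i];
            unsigned char to = (unsigned char)Alpha[(i + 13) % 26];
            lookup[from] = to;
            lookup[toupper(from)] = (unsigned char)toupper(to);
        }
        not_initialized_yet = 0;
    }

    /* like toupper(), only values in the range of unsigned char are mapped */
    return (c >= 0 && c <= UCHAR_MAX) ? lookup[c] : c;
}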

Doesn't that look complicated, now? But note that most of the
time -- *all* the time after the first time you call the function --
it doesn't even need to do any arithmetic! It's just a simple
table lookup, plus some complicated stuff to initialize the table.

I changed 'char' to 'int' to bring 'rot13' in line with similar
standard functions like 'toupper', which take and return 'int'.
As you've found out, 'char' is *almost* always replaceable by 'int'
(one big exception being text strings, obviously).
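
For the text-string case, a usage sketch might look like the following. To keep it short and self-contained it finds each letter by position with strchr() rather than using a table; `rot13_portable` and `rot13_string` are hypothetical names, not part of the code above.

```c
#include <stddef.h>
#include <string.h>

/* Portable per-character rot13: looks the letter up by position
   instead of relying on its numeric value. */
int rot13_portable(int c)
{
    static const char lower[] = "abcdefghijklmnopqrstuvwxyz";
    static const char upper[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    const char *p;

    if (c == '\0')
        return c;                       /* strchr would match the terminator */
    if ((p = strchr(lower, c)) != NULL)
        return lower[(p - lower + 13) % 26];
    if ((p = strchr(upper, c)) != NULL)
        return upper[(p - upper + 13) % 26];
    return c;                           /* not a letter: unchanged */
}

/* Apply rot13 to a string in place and return it. */
char *rot13_string(char *s)
{
    char *p;

    for (p = s; *p != '\0'; ++p)
        *p = (char)rot13_portable((unsigned char)*p);
    return s;
}
```

Applying rot13_string() twice to the same buffer gives back the original text, since 13 + 13 = 26.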

HTH,
-Arthur

Larry Doolittle
Guest
Posts: n/a

 12-16-2003
Arthur J. O'Dwyer wrote:
> <OT> Using gcc, it's easy:
> % gcc -W -Wall -ansi -pedantic -O2 mainfile.c rot13.c
> (and any other source files in the project). All the options
> are only there to catch mistakes in your code; if you write
> perfect code, you don't need them. [In other words, you
> *do* need them. Always.]
></OT>

My biggest complaint about gcc -W -Wall is that it barks at unused
parameters. If someone tells me how to make

int foo(void);
int bar(char*cd)
{
foo();
return 0;
}

standards-conforming, non-bloated, and run past "gcc -W -Wall -ansi -std=c99"
without warnings, I'd like to hear about it. My default set of
flags to gcc now also includes

- Larry

Larry Doolittle
Guest
Posts: n/a

 12-16-2003
In article <(E-Mail Removed)>, Larry Doolittle wrote:
>> <OT> Using gcc, [the mandatory flags are]:
>> % gcc -W -Wall -ansi -pedantic -O2 mainfile.c rot13.c
>></OT>

>
> My biggest complaint about gcc -W -Wall is that it barks at unused
> parameters. If someone tells me how to make [a perfectly good c program]
> standards-conforming, non-bloated, and run past "gcc -W -Wall -ansi -std=c99"
> without warnings, I'd like to hear about it.

Sorry, guys, somehow I thought the fix might be _inside_ the C program.
But just a _little_ more experimentation taught me that there is really
nothing wrong with the program, and I just need to add -Wno-unused-parameter
to the gcc options string. So this whole thing is still <OT>, although
I had myself confused enough I didn't realize it.

- Larry

Eric Sosman
Guest
Posts: n/a

 12-16-2003
Larry Doolittle wrote:
>
> Arthur J. O'Dwyer wrote:
> > <OT> Using gcc, it's easy:
> > % gcc -W -Wall -ansi -pedantic -O2 mainfile.c rot13.c
> > (and any other source files in the project). All the options
> > are only there to catch mistakes in your code; if you write
> > perfect code, you don't need them. [In other words, you
> > *do* need them. Always.]
> ></OT>

>
> My biggest complaint about gcc -W -Wall is that it barks at unused
> parameters. If someone tells me how to make
>
> int foo(void);
> int bar(char*cd)
> {
> foo();
> return 0;
> }
>
> standards-conforming, non-bloated, and run past "gcc -W -Wall -ansi -std=c99"
> without warnings, I'd like to hear about it. My default set of
> flags to gcc now also includes

int bar(char *cd) {
(void)cd;
foo();
return 0;
}

Of course, the Standard provides no way to prevent the compiler
from issuing whatever warnings it feels like. If it objects to
the color of your socks it's free to say so, so long as it
accepts and runs an otherwise conforming program despite the
sartorial cluelessness of the coder.

--
(E-Mail Removed)

Martin Dickopp
Guest
Posts: n/a

 12-16-2003
Larry Doolittle <(E-Mail Removed)> writes:

> My biggest complaint about gcc -W -Wall is that it barks at unused
> parameters. If someone tells me how to make
>
> int foo(void);
> int bar(char*cd)
> {
> foo();
> return 0;
> }
>
> standards-conforming, non-bloated, and run past "gcc -W -Wall -ansi -std=c99"
> without warnings, I'd like to hear about it.

This is not standard conforming, but usually close enough in practice:

#ifdef __GNUC__
# define unused __attribute__ ((unused))
#else
# define unused
#endif

int foo(void);
int bar(char*cd unused)
{
foo();
return 0;
}

(BTW, `-ansi' is an alias for `-std=c89', so it makes little sense to
specify both `-ansi' and `-std=c99'.)

Martin

CBFalconer
Guest
Posts: n/a

 12-17-2003
Larry Doolittle wrote:
>

.... snip ...
>
> My biggest complaint about gcc -W -Wall is that it barks at unused
> parameters. If someone tells me how to make
>
> int foo(void);
> int bar(char*cd)
> {
> foo();
> return 0;
> }
>
> standards-conforming, non-bloated, and run past "gcc -W -Wall -ansi
> -std=c99" without warnings, I'd like to hear about it. My default
> set of flags to gcc now also includes

Rewrite it as:

int foo(void);
int bar(void)
{
foo();
return 0;
}

and you will get no warnings. You will also save the overhead
of passing unused parameters in each call. You should also be
using -pedantic.

--
Chuck F ((E-Mail Removed)) ((E-Mail Removed))
Available for consulting/temporary embedded and systems.

Larry Doolittle
Guest
Posts: n/a

 12-17-2003
In article <(E-Mail Removed)>, CBFalconer wrote:
> Larry Doolittle wrote:
>>

> ... snip ...
>> If someone tells me how to make
>>
>> int foo(void);
>> int bar(char*cd)
>> {
>> foo();
>> return 0;
>> }
>>
>> standards-conforming, non-bloated, and run past "gcc -W -Wall -ansi
>> -std=c99" without warnings, I'd like to hear about it. [chop]

>
> Rewrite it as:
>
> int foo(void);
> int bar(void)
> {
> foo();
> return 0;
> }
>
> and you will get no warnings. You will also save the overhead
> of passing unused parameters in each call.

The point is that bar(char *) is defined to fit into a
general interface. Other functions are defined that _do_
use their parameters, and function pointers to both bar
and these other functions are passed around or stored in
tables. These function pointers have to all have the
same type.

Even a contrived example would take more space than I think
people want to wade through on this newsgroup. For a widely
used example, just look at <OT> POSIX sigaction subsystem,
in particular the sa_sigaction function </OT>.
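
A very small sketch of the situation, with every name (`handler_fn`, `count_chars`, `always_zero`, `run_handlers`) invented for illustration. It uses the `(void)` cast suggested elsethread to mark the parameter as deliberately unused.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical common interface: every handler takes a char *
   and returns int. */
typedef int (*handler_fn)(char *arg);

/* Uses its parameter. */
static int count_chars(char *arg)
{
    return (int)strlen(arg);
}

/* Must match handler_fn, but has no use for the argument. */
static int always_zero(char *arg)
{
    (void)arg;   /* marks the parameter as deliberately unused */
    return 0;
}

/* Both functions fit in one table because they share a type. */
static const handler_fn handlers[] = { count_chars, always_zero };

/* Calls every handler with the same argument and sums the results. */
int run_handlers(char *arg)
{
    size_t i;
    int total = 0;

    for (i = 0; i < sizeof handlers / sizeof handlers[0]; ++i)
        total += handlers[i](arg);
    return total;
}
```

Changing always_zero() to take void would make its type differ from handler_fn, so it could no longer go in the table without a cast.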

> You should also be using -pedantic.

I do. And as pointed out elsethread, I "fixed" the (gcc-specific)
warnings by adding the (gcc-specific) "-Wno-unused-parameter" flag.
Interestingly, this adjustment is not needed with either "-W" or
"-Wall", only with both. In retrospect, I might have been able
to deduce that from the man page.

- Larry