A few questions about encoding

 
 
Antoon Pardon
06-14-2013
On 13-06-13 10:08, Νικόλαος Κούρας wrote:
> On 13/6/2013 10:58 am, Chris Angelico wrote:
>> On Thu, Jun 13, 2013 at 5:42 PM, Νικόλαος Κούρας
>> <(E-Mail Removed)> wrote:
>>> On 13/6/2013 10:11 am, Steven D'Aprano wrote:
>>>> No! That creates a string from 16474 in base two:
>>>> '0b100000001011010'
>>>
>>> I disagree here.
>>> 16474 is a number in base 10. Doing bin(16474) we get the binary
>>> representation of the number 16474 and not a string.
>>> Why do you say we receive a string when Python presents a binary number?
>>
>> You can disagree all you like. Steven cited a simple point of fact,
>> one which can be verified in any Python interpreter. Nikos, you are
>> flat wrong here; bin(16474) creates a string.
>
> Indeed Python wrapped it in single quotes, '0b100000001011010', and
> not as 0b100000001011010, which in fact makes it a string.
>
> But since bin(16474) seems to create a string rather than the number
> I expected (at least in my mind), how do we get the binary
> representation of the number 16474 as a number?


You don't. You should remember that Python (or any programming language)
doesn't print numbers. It always prints string representations of
numbers. It is just that we are so used to the decimal representation
that we think of that representation as being the number.

Normally that is not a problem, but it can cause confusion when you are
working with multiple representations.
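
The point can be checked in any Python 3 interpreter. The following is a minimal sketch, not part of the original post; the variable names are only illustrative. It shows that bin() hands back a str, and that int() with base 2 parses that notation back into the same number.

```python
n = 16474

# bin() does not produce a "binary number"; it produces a str.
s = bin(n)
print(type(n).__name__)   # int
print(type(s).__name__)   # str
print(s)                  # 0b100000001011010

# The interactive prompt displays repr(s), and the repr of a str
# includes the quotes that mark it as a string.
print(repr(s))            # '0b100000001011010'

# Going the other way: parse the binary notation back into the number.
round_trip = int(s, 2)
print(round_trip == n)    # True
```

The number itself never changed; only the text used to write it down did.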

--
Antoon Pardon

 

Nick the Gr33k
06-14-2013
On 14/6/2013 10:36 am, Antoon Pardon wrote:
> On 13-06-13 10:08, Νικόλαος Κούρας wrote:
>> On 13/6/2013 10:58 am, Chris Angelico wrote:
>>> On Thu, Jun 13, 2013 at 5:42 PM, Νικόλαος Κούρας
>>> <(E-Mail Removed)> wrote:
>>>> On 13/6/2013 10:11 am, Steven D'Aprano wrote:
>>>>> No! That creates a string from 16474 in base two:
>>>>> '0b100000001011010'
>>>>
>>>> I disagree here.
>>>> 16474 is a number in base 10. Doing bin(16474) we get the binary
>>>> representation of the number 16474 and not a string.
>>>> Why do you say we receive a string when Python presents a binary number?
>>>
>>> You can disagree all you like. Steven cited a simple point of fact,
>>> one which can be verified in any Python interpreter. Nikos, you are
>>> flat wrong here; bin(16474) creates a string.
>>
>> Indeed Python wrapped it in single quotes, '0b100000001011010', and
>> not as 0b100000001011010, which in fact makes it a string.
>>
>> But since bin(16474) seems to create a string rather than the number
>> I expected (at least in my mind), how do we get the binary
>> representation of the number 16474 as a number?
>
> You don't. You should remember that Python (or any programming language)
> doesn't print numbers. It always prints string representations of
> numbers. It is just that we are so used to the decimal representation
> that we think of that representation as being the number.
>
> Normally that is not a problem, but it can cause confusion when you are
> working with multiple representations.

Hold on!
You are basically saying here that:

>>> 16474
16474

is not a number as we think, but instead is a string representation of a
number?

I don't think so. If it were a string representation of a number, it
would print the following:

>>> 16474
'16474'

Python prints numbers:

>>> 16474
16474
>>> 0b100000001011010
16474
>>> 0x405a
16474

It prints them all in decimal format, though.
But when we need a decimal integer to be turned into binary or hex, we
can call bin(number) or hex(number) and just remove the pair of single quotes.

--
What is now proved was at first only imagined!
 

Antoon Pardon
06-14-2013
On 14-06-13 09:49, Nick the Gr33k wrote:
> On 14/6/2013 10:36 am, Antoon Pardon wrote:
>> On 13-06-13 10:08, Νικόλαος Κούρας wrote:
>>>
>>> Indeed Python wrapped it in single quotes, '0b100000001011010', and
>>> not as 0b100000001011010, which in fact makes it a string.
>>>
>>> But since bin(16474) seems to create a string rather than the number
>>> I expected (at least in my mind), how do we get the binary
>>> representation of the number 16474 as a number?
>>
>> You don't. You should remember that Python (or any programming language)
>> doesn't print numbers. It always prints string representations of
>> numbers. It is just that we are so used to the decimal representation
>> that we think of that representation as being the number.
>>
>> Normally that is not a problem, but it can cause confusion when you are
>> working with multiple representations.
>
> Hold on!
> You are basically saying here that:
>
> >>> 16474
> 16474
>
> is not a number as we think, but instead is a string representation of a
> number?

Yes, or if you prefer: what Python prints is the decimal notation of the number.

> I don't think so. If it were a string representation of a number, it
> would print the following:
>
> >>> 16474
> '16474'

No, it wouldn't. You are confusing representation in the everyday sense
with representation as Python jargon.


> Python prints numbers:

No, it doesn't. Numbers are abstract concepts that can be represented in
various notations; these notations are strings. Those notational strings
end up being printed. As I said before, we are so used to the
decimal notation that we often use the notation and the number interchangeably
without a problem. But when we are working with multiple notations, that
can become confusing, and we should be careful to separate numbers from their
representations/notations.


> but when we need a decimal integer


There are no decimal integers. There is only a decimal notation of the number.
Decimal, octal, etc. are not characteristics of the numbers themselves.
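
As an illustration (a small sketch added for clarity, assuming Python 3; the names a, b, c and notations are made up for the example), three different notations in source code produce one and the same int value, and a base is chosen only when that value is turned back into text:

```python
# Three notations in source code, one and the same int value.
a = 16474
b = 0b100000001011010
c = 0x405a
print(a == b == c)        # True

# The int itself carries no base; a notation is picked only when
# the number is converted to text.
notations = {
    "decimal": format(a, "d"),
    "binary": format(a, "b"),
    "octal": format(a, "o"),
    "hex": format(a, "x"),
}
for name, text in notations.items():
    print(name, text)
```

Each value in the dictionary is a str; the number behind all four is identical.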

--

Antoon Pardon

 

Nick the Gr33k
06-14-2013
On 14/6/2013 11:22 am, Antoon Pardon wrote:

>> Python prints numbers:
>
> No, it doesn't. Numbers are abstract concepts that can be represented in
> various notations; these notations are strings. Those notational strings
> end up being printed. As I said before, we are so used to the
> decimal notation that we often use the notation and the number interchangeably
> without a problem. But when we are working with multiple notations, that
> can become confusing, and we should be careful to separate numbers from their
> representations/notations.


How do we separate a number from its representation/notation, then?

What is a notation anyway? Is it a way of displaying something? But that would be
a representation then...

Please explain this line, as it uses both terms:

> No, it doesn't. Numbers are abstract concepts that can be represented in
> various notations

>> but when we need a decimal integer

>
> There are no decimal integers. There is only a decimal notation of the number.
> Decimal, octal etc are not characteristics of the numbers themselves.


So everything we see like:

16474
nikos
abc123

everything is a string and nothing is a number? not even number 1?

--
What is now proved was at first only imagined!
 

Heiko Wundram
06-14-2013
On 14.06.2013 10:37, Nick the Gr33k wrote:
> So everything we see like:
>
> 16474
> nikos
> abc123
>
> everything is a string and nothing is a number? not even number 1?


Come on now, this is _so_ obviously trolling, it's not even remotely
funny anymore. Why doesn't killfiling work with the mailing list version
of the python list?

--
--- Heiko.
 

Nick the Gr33k
06-14-2013
On 14/6/2013 12:06 pm, Heiko Wundram wrote:
> On 14.06.2013 10:37, Nick the Gr33k wrote:
>> So everything we see like:
>>
>> 16474
>> nikos
>> abc123
>>
>> everything is a string and nothing is a number? not even number 1?

>
> Come on now, this is _so_ obviously trolling, it's not even remotely
> funny anymore. Why doesn't killfiling work with the mailing list version
> of the python list?
>


I'm not trolling, man. I just have a hard time understanding why numbers
act as strings.

--
What is now proved was at first only imagined!
 

Cameron Simpson
06-14-2013
On 14Jun2013 09:59, Nikos as SuperHost Support <(E-Mail Removed)> wrote:
| On 14/6/2013 4:00 am, Cameron Simpson wrote:
| >On 13Jun2013 17:19, Nikos as SuperHost Support <(E-Mail Removed)> wrote:
| >| A code-point and the code-point's ordinal value are associated into
| >| a Unicode charset. They have the so called 1:1 mapping.
| >|
| >| So, i was under the impression that by encoding the code-point into
| >| utf-8 was the same as encoding the code-point's ordinal value into
| >| utf-8.
| >|
| >| So, now i believe they are two different things.
| >| The code-point *is what actually* needs to be encoded and *not* its
| >| ordinal value.
| >
| >Because there is a 1:1 mapping, these are the same thing: a code
| >point is directly _represented_ by the ordinal value, and the ordinal
| >value is encoded for storage as bytes.
|
| So, you are saying that:
|
| chr(16474).encode('utf-8') #being the code-point encoded
|
| ord(chr(16474)).encode('utf-8') #being the code-point's ordinal
| encoded which gives an error.
|
| that shows us that a character is what is being encoded to utf-8
| but the character's ordinal cannot.
|
| So, why do you say "....and the ordinal value is encoded for storage
| as bytes." ?

No, I mean conceptually, there is no difference between a codepoint
and its ordinal value. They are the same thing.

Inside Python itself, a character is a string of length 1 (there is
no separate character type), and a string is a distinct type from an
integer. Internally, the characters in a string are stored numerically:
as Unicode codepoints, that is, as their ordinal values.

It is a meaningful idea to store a Python string encoded into bytes
using some text encoding scheme (utf-8, iso-8859-7, what have you).

It is not a meaningful thing to store a number "encoded" without
some more context. The .encode() method that accepts an encoding
name like "utf-8" is specifically an encoding procedure FOR TEXT.

So strings have such a method, and integers do not.

When you write:

chr(16474)

you receive a _string_, containing the single character whose ordinal
is 16474. It is meaningful to transcribe this string to bytes using
a text encoding procedure like 'utf-8'.

When you write:

ord(chr(16474))

you get an integer. Because ord() is the reverse of chr(), you get
the integer 16474.

Integers do not have .encode() methods that accept a _text_ encoding
name like 'utf-8' because integers are not text.
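
A short sketch of the distinction, assuming Python 3 (the variable names are illustrative and not from the original post). It encodes the one-character string, recovers the integer with ord(), shows that the integer has no .encode() method, and then converts the bare integer to bytes with int.to_bytes(), which needs a length and a byte order rather than a text encoding:

```python
ch = chr(16474)                 # a str of length 1
encoded = ch.encode("utf-8")    # text -> bytes
print(type(ch).__name__, len(ch))
print(encoded)                  # b'\xe4\x81\x9a'

n = ord(ch)                     # back to the int 16474
print(n)

# int has no .encode(); asking for one raises AttributeError.
try:
    n.encode("utf-8")
except AttributeError as exc:
    print("AttributeError:", exc)

# Turning a bare integer into bytes needs different context:
# a length and a byte order, not a text encoding.
raw = n.to_bytes(2, "big")
print(raw)                      # b'@Z'
```

Note that the UTF-8 bytes (e4 81 9a) and the two-byte big-endian form (40 5a) differ: both start from 16474, but they answer different questions.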

| >| > The leading 0b is just syntax to tell you "this is base 2, not base 8
| >| > (0o) or base 10 or base 16 (0x)". Also, leading zero bits are dropped.
| >|
| >| But byte objects are represented as '\x' instead of the
| >| aforementioned '0x'. Why is that?
| >
| >You're confusing a "string representation of a single number in
| >some base (eg 2 or 16)" with the "string-ish representation of a
| >bytes object".
|
| >>> bin(16474)
| '0b100000001011010'
| that is a binary format string representation of number 16474, yes?

Yes.

| >>> hex(16474)
| '0x405a'
| that is a hexadecimal format string representation of number 16474, yes?

Yes.

| WHILE:
| b'abc\x1b\n' = a string representation of a byte, which in turn is a
| series of integers, so that makes this a string representation of
| integers, is this correct?

A "bytes" Python object. So not "a byte", 5 bytes.
It is a string representation of the series of byte values,
ON THE PREMISE that the bytes may well represent text.
On that basis, b'abc\x1b\n' is a reasonable way to display them.

In other contexts this might not be a sensible way to display these
bytes, and then another format would be chosen, possibly hand
constructed by the programmer or, equally reasonably, produced by the
hexlify() function from the binascii module.
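
A few of those alternative displays, as a rough sketch assuming Python 3.5 or later (bytes.hex() does not exist in older versions; the variable names are illustrative):

```python
import binascii

data = b'abc\x1b\n'

print(len(data))                 # 5
as_ints = list(data)             # iterating bytes yields ints
print(as_ints)                   # [97, 98, 99, 27, 10]

hexed = binascii.hexlify(data)   # a bytes object of hex digits
print(hexed)                     # b'6162631b0a'
print(data.hex())                # 6162631b0a

# A hand-constructed display: each byte as eight binary digits.
as_bits = " ".join(format(value, "08b") for value in data)
print(as_bits)
```

All of these show the same five byte values; none is more "real" than the b'abc\x1b\n' form.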

| \x1b = ESC character

Considering the bytes to be representing characters, then yes.

| \ = for separating bytes

No, \ to introduce a sequence of characters with special meaning.

Normally a character in a b'...' item represents the byte value
matching the character's Unicode ordinal value. But several characters
are hard or confusing to place literally in a b'...' string, for
example a newline character or an escape character.

'a' means 97.
'\n' means 10 (newline, hence the 'n').
'\x1b' means 27 (escape; 27 is 0x1b in hexadecimal).
And, of course, '\\' means a literal slosh (backslash), value 92.

| x = to flag that the following bytes are going to be represented as
| hex values? whats exactly 'x' means here? character perhaps?

A slosh followed by an 'x' means there will be 2 hexadecimal digits
to follow, and those two digits represent the byte value.

So, yes.

| Still it's not clear in my head what the difference between '0x1b' and
| '\x1b' is:

They're the same thing in two similar but slightly different formats.

0x1b is a legitimate "bare" integer value in Python.

\x1b is a sequence you find inside strings (and "byte" strings, the
b'...' format).

| i think:
| 0x1b = an integer represented in hex format

Yes.

| \x1b = a character represented in hex format

Yes.
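
To make the comparison concrete, here is a small sketch assuming Python 3 (the names are illustrative). Indexing a bytes object yields the int value of that byte, so the escape sequences can be compared directly with bare integer literals:

```python
# Indexing a bytes object yields the int value of that byte.
values = {
    "a": b'a'[0],
    "newline": b'\n'[0],
    "escape": b'\x1b'[0],
    "backslash": b'\\'[0],
}
print(values)

# 0x1b is an int literal; '\x1b' is a str of length 1.
print(type(0x1b).__name__, type('\x1b').__name__)

# Both notations refer to the same value, 27.
same_value = (b'\x1b'[0] == 0x1b == 27) and (ord('\x1b') == 0x1b)
print(same_value)   # True
```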

| >| How can i view this byte's object representation as hex() or as bin()?
| >
| >See above. A bytes is a _sequence_ of values. hex() and bin() print
| >individual values in hexadecimal or binary respectively.
|
| >>> for value in b'\x97\x98\x99\x27\x10':
| ... print(value, hex(value), bin(value))
| ...
| 151 0x97 0b10010111
| 152 0x98 0b10011000
| 153 0x99 0b10011001
| 39 0x27 0b100111
| 16 0x10 0b10000
|
|
| >>> for value in b'abc\x1b\n':
| ... print(value, hex(value), bin(value))
| ...
| 97 0x61 0b1100001
| 98 0x62 0b1100010
| 99 0x63 0b1100011
| 27 0x1b 0b11011
| 10 0xa 0b1010
|
|
| Why do these two give different values when printed?

97 is in base 10 (9*10+7=97), but the notation '\x97' is base 16, so 9*16+7=151.
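
The same arithmetic can be reproduced with int(), which takes the base as its second argument. This is a minimal sketch assuming Python 3; the variable names are made up for the example:

```python
# The digits '97' read as decimal and as hexadecimal are two different numbers.
as_decimal = int("97", 10)   # 9*10 + 7
as_hex = int("97", 16)       # 9*16 + 7
print(as_decimal, as_hex)    # 97 151

# b'\x97' is one byte with value 0x97; b'a' is one byte with value 97.
first = b'\x97'[0]
second = b'a'[0]
print(first, second)         # 151 97

# Building a byte from the number 97 gives b'a', not b'\x97'.
print(bytes([97]))           # b'a'
```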

Cheers,
--
Cameron Simpson <(E-Mail Removed)>

I'm Bubba of Borg. Y'all fixin' to be assimilated.
 

Heiko Wundram
06-14-2013
On 14.06.2013 11:32, Nick the Gr33k wrote:
> I'm not trolling, man. I just have a hard time understanding why numbers
> act as strings.


If you can't grasp the conceptual differences between numbers and
their/a representation, it's probably best if you stayed away from
programming altogether.

I don't think you're actually as thick as you sound, but rather either
you're simply too damn lazy to take the time to inform yourself from all
the hints/links/information you've been given, or you're trolling. I'm
still leaning towards the second.

--
--- Heiko.
 

Cameron Simpson
06-14-2013
On 14Jun2013 11:37, Nikos as SuperHost Support <(E-Mail Removed)> wrote:
| On 14/6/2013 11:22 am, Antoon Pardon wrote:
|
| >>Python prints numbers:
| >No, it doesn't. Numbers are abstract concepts that can be represented in
| >various notations; these notations are strings. Those notational strings
| >end up being printed. As I said before, we are so used to the
| >decimal notation that we often use the notation and the number interchangeably
| >without a problem. But when we are working with multiple notations, that
| >can become confusing, and we should be careful to separate numbers from their
| >representations/notations.
|
| How do we separate a number from its representation/notation, then?

Shrug. When you "print" a number, Python transcribes a string
representation of it to your terminal.

| What is a notation anyway? Is it a way of displaying something? But that
| would be a representation then...

Yep. Same thing. A "notation" is a particular formal method of
representation.

| No it doesn't, numbers are abstract concepts that can be represented in
| various notations
|
| >>but when we need a decimal integer
| >
| >There are no decimal integers. There is only a decimal notation of the number.
| >Decimal, octal etc are not characteristics of the numbers themselves.
|
| So everything we see like:
|
| 16474
| nikos
| abc123
|
| everything is a string and nothing is a number? not even number 1?

Everything you see like that is textual information. Internally to
Python, various types are used: strings, bytes, integers etc. But
when you print something, text is output.
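
One way to see this is to capture what print() writes. The sketch below assumes Python 3.4 or later (for contextlib.redirect_stdout); the names buffer and captured are illustrative. An int, a str and a hex literal all produce identical text on output, while inside the program their types stay distinct:

```python
import io
from contextlib import redirect_stdout

buffer = io.StringIO()
with redirect_stdout(buffer):
    print(16474)      # an int
    print("16474")    # a str
    print(0x405a)     # the same int written in hex notation

captured = buffer.getvalue()
print(repr(captured))  # three identical lines of text

# The types inside the program remain distinct.
print(16474 == 0x405a)     # True
print(16474 == "16474")    # False
print(repr(16474), repr("16474"))
```

The quotes appear only in the repr of the str, which is what the interactive prompt displays; print() writes the text without them.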

Cheers,
--
Cameron Simpson <(E-Mail Removed)>

A long-forgotten loved one will appear soon. Buy the negatives at any price.
 

Fábio Santos
06-14-2013
On 14 Jun 2013 10:20, "Heiko Wundram" <(E-Mail Removed)> wrote:
>
> On 14.06.2013 10:37, Nick the Gr33k wrote:
>>
>> So everything we see like:
>>
>> 16474
>> nikos
>> abc123
>>
>> everything is a string and nothing is a number? not even number 1?
>
> Come on now, this is _so_ obviously trolling, it's not even remotely
> funny anymore. Why doesn't killfiling work with the mailing list version of
> the python list?

I have skimmed the archives for this month, and I estimate that a third of
this month's activity on this list was helping this person. About 80% of
that is wasted in explaining basic concepts he refuses to read in links
given to him. A depressingly large number of replies to his posts are
seemingly ignored.

Since this is a lot of spam, I feel like leaving the list, but I also
honestly want to help people use python and the replies to questions of
others often give me much insight on several matters.

 