# lec2

## Binary Bits & Bytes

> Binary Notation 0b...

Typically we see a `0b` prefix, but sometimes, as in many x86 assemblers, we'll see a `...b` suffix to denote a bit string.

Most typically we deal with binary (when we do) in nibbles, or 4-_bit_ chunks, which are then paired up to build a byte.
Ex: `0101 1100` is a basic random byte.
For most sane solutions this is essentially the only way we __ever__ deal with binary.

> Why can't we save bits and not use nibbles?

In truth you can totally do that, but in practice it rarely saves anything.
To explain, let's look at some higher-level C/C++ code; say you had this structure:

```
struct Point {
    int x;                  // assume a 32-bit int for this example
    int y;
    unsigned int valid : 1; // 1-bit bit-field
};
```
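
On a typical x86 system (and many x64 systems) with default struct packing, the occupied part of this structure might look like the following. Compilers usually add further alignment padding after it, so that `sizeof(struct Point)` comes out as a multiple of 4 bytes:

```
32 (int x) + 32 (int y) + 1 (unsigned int valid) + 7 (bits of padding to finish the byte)
```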
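
To see what your own compiler does with this layout, here is a minimal sketch that compares the struct's real size with the bits its members need. The helper names (`point_size_bits`, `point_payload_bits`, `point_padding_bits`) are made up for this example, and the exact numbers are implementation-defined.

```c
#include <limits.h>  /* CHAR_BIT */
#include <stddef.h>  /* size_t */

struct Point {
    int x;
    int y;
    unsigned int valid : 1;  /* 1-bit bit-field */
};

/* Total size of the struct in bits, including any padding. */
size_t point_size_bits(void) {
    return sizeof(struct Point) * CHAR_BIT;
}

/* Bits the declared members actually need: two ints plus the 1-bit flag. */
size_t point_payload_bits(void) {
    return 2 * sizeof(int) * CHAR_BIT + 1;
}

/* Padding bits the compiler added on this platform. */
size_t point_padding_bits(void) {
    return point_size_bits() - point_payload_bits();
}
```

Because the struct has to occupy a whole number of bytes, `point_padding_bits()` can never be smaller than 7 here. On common x86 compilers it is likely to be larger, since the struct is also padded out to the alignment of `int`.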

Why? Because while we can always calculate the address of a particular byte in memory, we can't, or rather don't even try to, do the same for bits.
The reason is simple: a 32-bit CPU can calculate any number between `0` and `0xffffffff` (`4294967295`) inclusive. That gives us enough distinct addresses to have one number per byte, but not enough to cover every individual bit as well.

If we use that `valid` _bit-field_ in our code later, like:

```
if (point_ref->valid) {
    /* do stuff */
}
```

The machine code generated will really just load the byte that contains the bit we care about, mask off the bits around it, and check whether the result is a non-zero value.

If the bit is set we have (for example) `0b0000 0001`, thus a _true_ value.
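
Since only bytes have addresses, reaching a single bit means picking the byte first and then masking inside it. A rough sketch of that idea, assuming 8-bit bytes and counting bits from the least significant bit of `buf[0]` (the function name `bit_is_set` is hypothetical):

```c
#include <stddef.h>  /* size_t */

/* Return 1 if bit number bit_index is set, 0 otherwise. */
int bit_is_set(const unsigned char *buf, size_t bit_index) {
    size_t byte_index = bit_index / 8;  /* which byte holds the bit */
    unsigned char mask = (unsigned char)(1u << (bit_index % 8));  /* which bit inside it */
    return (buf[byte_index] & mask) != 0;
}
```

With a buffer holding `0x01, 0x80`, bit 0 and bit 15 are set and every other bit is clear.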

## Two's Complement - aka Negate

To find the negation of any bit-string:

i.e. `3 * -1 => -3`

1. Flip all bits in the bit-string
2. Add 1 to the bit-string

The case for 3:

```
start off: 0011 =>  3

flip bits: 1100 => -4

add one:   1101 => -3
```
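
The two steps can be sketched in C for a 4-bit value. This is only an illustration: `negate4` is a made-up name, and the `& 0xF` masks are there because C has no 4-bit integer type.

```c
/* Negate a 4-bit value: flip all bits, add 1, keep only the low 4 bits. */
unsigned negate4(unsigned bits) {
    unsigned flipped = ~bits & 0xFu;  /* step 1: flip all bits (within 4 bits) */
    return (flipped + 1u) & 0xFu;     /* step 2: add 1, dropping any carry out */
}
```

Calling `negate4(0x3)` gives `0xD` (`1101`), and negating that again gives back `0x3`.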

### Signedness

> Why?

Because this matters for dealing with `signed` and `unsigned` values. _No, it doesn't mean positive and negative numbers._
Say we have 4 bits to mess with. This means we have a range of `0000` to `1111`. If we wanted purely non-negative numbers, we could use all of `0000` to `1111`, or 0 to 15.
If we needed negative representation, however, we have to sacrifice some of our range.
Our new non-negative range is then `0-7`, _or in binary_: `0000 - 0111`. The range stops there because the largest number we can represent without setting the first bit is `0111` => `7`.
Our negative range is then `-8 -> -1`, which in binary is `0b1000 -> 0b1111`.
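
To make the two readings of the same 4 bits concrete, here is a small sketch. The function names are hypothetical, and subtracting 16 when the top bit is set is just one way to express the two's complement reading.

```c
/* Read a 4-bit pattern as an unsigned value: 0 to 15. */
int as_unsigned4(unsigned bits) {
    return (int)(bits & 0xFu);
}

/* Read the same 4-bit pattern as a signed two's complement value: -8 to 7. */
int as_signed4(unsigned bits) {
    int value = (int)(bits & 0xFu);
    return (value & 0x8) ? value - 16 : value;  /* top bit set: subtract 2^4 */
}
```

The pattern `1111` reads as 15 through `as_unsigned4` and as -1 through `as_signed4`; patterns up to `0111` read the same either way.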

## Intro to hex

> Hex Notation 0x...

x86 assemblers (e.g. masm) will typically accept `...h` as a postfix notation.

More convenient than binary for obvious reasons; namely it doesn't look like spaghetti on the screen.

Our 4-bit range from earlier {0000-1111} now becomes {0-f}, and a full byte {0000 0000-1111 1111} becomes {00-ff}.
More pedantically, a byte's hex range is 0x00 to 0xff.

> Binary mapped

It happens that 1 nibble makes up 0x0 to 0xF.
So for now just get used to converting {0000-1111} to its respective value in hex, and eventually it should be second nature.
Then just move on to using hex (like immediately after these lessons), because writing actual binary is awful.
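
The nibble-to-hex-digit mapping is mechanical enough to write down as a lookup. A minimal sketch (the name `byte_to_hex` is made up, and lowercase digits are an arbitrary choice):

```c
/* Write the two hex digits of byte b into out, plus a terminating NUL. */
void byte_to_hex(unsigned char b, char out[3]) {
    static const char digits[] = "0123456789abcdef";
    out[0] = digits[(b >> 4) & 0xF];  /* high nibble */
    out[1] = digits[b & 0xF];         /* low nibble */
    out[2] = '\0';
}
```

The byte `0101 1100` splits into the nibbles `0101` (5) and `1100` (c), so it comes out as `5c`.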

> Dude trust me hex is way better to read than decimal

It may seem inconvenient at first, but after a while you'll realize that hex has really easy-to-understand uses and makes things super clear + concise, especially when dealing with bit masks and bitsets.
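
As a rough illustration of why masks read better in hex, each hex digit lines up with exactly one nibble, so a mask's shape is visible at a glance. The helper names below are hypothetical, and `n` is assumed to be between 0 and 3.

```c
#include <stdint.h>

/* Extract byte n (0 = least significant) from a 32-bit value. */
uint8_t extract_byte(uint32_t value, unsigned n) {
    return (uint8_t)((value >> (8u * n)) & 0xFFu);
}

/* Keep only the high 4 bits of a byte. */
uint8_t high_nibble_only(uint8_t b) {
    return (uint8_t)(b & 0xF0u);  /* 0xF0 is 1111 0000; in decimal, the less obvious 240 */
}
```

With `0xDEADBEEF` you can read the result straight off the literal: byte 0 is `0xEF` and byte 3 is `0xDE`.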

> Ascii in Hex Dumps

Kind of a side note, but printable ASCII text values range from 0x20 to 0x7e, so if you're looking for text in a binary, look for runs of bytes in that range.
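
A simple way to hunt for text is to scan for a run of bytes that all fall in the printable ASCII range, 0x20 (space) to 0x7e (`~`). The sketch below is a toy version of that idea with made-up function names; it assumes `min_len` is at least 1.

```c
#include <stddef.h>  /* size_t */

/* Printable ASCII runs from 0x20 (space) to 0x7e ('~'). */
int is_printable_ascii(unsigned char c) {
    return c >= 0x20 && c <= 0x7e;
}

/* Return the offset of the first run of at least min_len printable bytes,
   or -1 if there is no such run. */
long find_text_run(const unsigned char *buf, size_t len, size_t min_len) {
    size_t run = 0;  /* length of the current printable run */
    for (size_t i = 0; i < len; i++) {
        if (is_printable_ascii(buf[i])) {
            run++;
            if (run >= min_len) {
                return (long)(i + 1 - run);  /* start of the run */
            }
        } else {
            run = 0;  /* run broken by a non-printable byte */
        }
    }
    return -1;
}
```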

## 32 v 64 bit

In case you come from an x86_64-ish background, know that in MIPS the terminology changes a bit (pun intended).

> x86 byte = mips byte

> x86 word = mips half word

> x86 dword = mips word

> x86/64 qword = mips dword

So just keep those translations in mind...
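
If it helps to pin the translations to concrete sizes, they can be written as fixed-width C types. The typedef and enum names here are invented for this note and are not part of any MIPS or x86 header.

```c
#include <stdint.h>

/* Hypothetical typedef names, one per MIPS size term. */
typedef uint8_t  mips_byte_t;      /* x86 byte  */
typedef uint16_t mips_halfword_t;  /* x86 word  */
typedef uint32_t mips_word_t;      /* x86 dword */
typedef uint64_t mips_dword_t;     /* x86 qword */

/* Width of each term in bits. */
enum {
    MIPS_BYTE_BITS     = 8 * sizeof(mips_byte_t),
    MIPS_HALFWORD_BITS = 8 * sizeof(mips_halfword_t),
    MIPS_WORD_BITS     = 8 * sizeof(mips_word_t),
    MIPS_DWORD_BITS    = 8 * sizeof(mips_dword_t)
};
```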