Deep dive into bits, bytes, shorts, ints, longs, signed, and unsigned with Java
On the Pi4J discussion list, someone recently asked about the best and easiest way to convert a byte value in Java. Java has a single byte type, which is signed, and no separate unsigned variant, which can be confusing. My book “Getting Started with Java on the Raspberry Pi” explains this, and I am happy to share that explanation in this post with some more info and code examples…
You can find all the code of this post on GitHub.
The Basics: Bits
Let’s start with the basics: bits, 0 or 1.
A bit (binary digit) is the smallest unit of data in a computer and has two possible values: 0 or 1. Bits are the foundation of everything that happens in a computer. All instructions and data stored in memory are represented as combinations of bits. Bits are usually grouped into larger units such as bytes or words, so they can represent numbers, characters, or control signals, depending on the context in which they are used.
In everyday life, we are used to decimal values, which count in groups of ten. In programming, hexadecimal (or hex) values are often used, which count in groups of sixteen. A single hex digit ranges from 0 to F, where F is 15 in decimal, which matches exactly the maximum value of four bits (1111 in binary). Hex values are typically written with a 0x prefix, for example 0x0 to 0xF, to distinguish them from decimal notation.
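To make the notation concrete, here is a minimal sketch of how hex values can be written and parsed in Java. The class name `HexDigits` and the helper `hexDigitToDecimal` are made up for this illustration; `Integer.parseInt` with a radix is standard Java.

```java
class HexDigits {
    // Hypothetical helper: parses a single hex digit ("0" to "F") into its decimal value.
    static int hexDigitToDecimal(String hexDigit) {
        return Integer.parseInt(hexDigit, 16);
    }

    public static void main(String[] args) {
        int fromHexLiteral = 0xF;       // hex literal, same value as decimal 15
        int fromBinaryLiteral = 0b1111; // binary literal (Java 7+), four bits all set to 1

        System.out.println(fromHexLiteral);                      // 15
        System.out.println(fromHexLiteral == fromBinaryLiteral); // true
        System.out.println(hexDigitToDecimal("A"));              // 10
    }
}
```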
Each binary digit (bit) represents a power of 2, starting from the rightmost bit (which is 2^0) and moving left. The value of the binary number is the sum of the powers of 2 at the positions where the bit is 1.
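As a small illustration of that sum, the sketch below walks over a string of bits and adds the matching power of 2 for every 1 it finds. The class `BitSum` and method `toValue` are names invented for this example; in real code, `Integer.parseInt(bits, 2)` gives the same result.

```java
class BitSum {
    // Sums the powers of 2 for every position that holds a '1'.
    static int toValue(String bits) {
        int value = 0;
        for (int i = 0; i < bits.length(); i++) {
            // Index 0 of the string is the leftmost bit, so it has the highest power
            int power = bits.length() - 1 - i;
            if (bits.charAt(i) == '1') {
                value += 1 << power; // 1 << power equals 2^power
            }
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(toValue("1011"));             // 8+0+2+1 = 11
        System.out.println(toValue("1111"));             // 8+4+2+1 = 15
        System.out.println(Integer.parseInt("1011", 2)); // 11, built-in equivalent
    }
}
```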
The following table shows all possible combinations of 4 bits ranging from “0000” to “1111”.
| Bits | 2^3 | 2^2 | 2^1 | 2^0 | + | Number | HEX |
|---|---|---|---|---|---|---|---|
| | 8 | 4 | 2 | 1 | | | |
| 0000 | 0 | 0 | 0 | 0 | 0+0+0+0 | 0 | 0x0 |
| 0001 | 0 | 0 | 0 | 1 | 0+0+0+1 | 1 | 0x1 |
| 0010 | 0 | 0 | 1 | 0 | 0+0+2+0 | 2 | 0x2 |
| 0011 | 0 | 0 | 1 | 1 | 0+0+2+1 | 3 | 0x3 |
| 0100 | 0 | 1 | 0 | 0 | 0+4+0+0 | 4 | 0x4 |
| 0101 | 0 | 1 | 0 | 1 | 0+4+0+1 | 5 | 0x5 |
| 0110 | 0 | 1 | 1 | 0 | 0+4+2+0 | 6 | 0x6 |
| 0111 | 0 | 1 | 1 | 1 | 0+4+2+1 | 7 | 0x7 |
| 1000 | 1 | 0 | 0 | 0 | 8+0+0+0 | 8 | 0x8 |
| 1001 | 1 | 0 | 0 | 1 | 8+0+0+1 | 9 | 0x9 |
| 1010 | 1 | 0 | 1 | 0 | 8+0+2+0 | 10 | 0xA |
| 1011 | 1 | 0 | 1 | 1 | 8+0+2+1 | 11 | 0xB |
| 1100 | 1 | 1 | 0 | 0 | 8+4+0+0 | 12 | 0xC |
| 1101 | 1 | 1 | 0 | 1 | 8+4+0+1 | 13 | 0xD |
| 1110 | 1 | 1 | 1 | 0 | 8+4+2+0 | 14 | 0xE |
| 1111 | 1 | 1 | 1 | 1 | 8+4+2+1 | 15 | 0xF |
This video by Mathmo14159 very nicely illustrates how this works.
And we can achieve the same result with the following Java code:
System.out.println("Value\tBits\tHex");
for (int i = 0; i <= 15; i++) {
    System.out.println(i
        + "\t" + String.format("%4s", Integer.toBinaryString(i)).replace(' ', '0')
        + "\t0x" + Integer.toHexString(i).toUpperCase());
}
// Output
Value Bits Hex
0 0000 0x0
1 0001 0x1
2 0010 0x2
3 0011 0x3
4 0100 0x4
5 0101 0x5
6 0110 0x6
7 0111 0x7
8 1000 0x8
9 1001 0x9
10 1010 0xA
11 1011 0xB
12 1100 0xC
13 1101 0xD
14 1110 0xE
15 1111 0xF
Bits to Byte
A byte consists of 8 bits and has the range of 0x00 (= 0) to 0xFF (= 255).
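A minimal sketch of that range in Java, printing a few values as eight bits and two hex digits. The class `ByteRange` and its helpers `toBits` and `toHex` are hypothetical names for this illustration. The sketch uses `int` values, because Java's own `byte` type is signed (-128 to 127).

```java
class ByteRange {
    // Formats a value from 0 to 255 as eight bits, padded with leading zeros.
    static String toBits(int value) {
        // "& 0xFF" keeps only the lowest 8 bits
        return String.format("%8s", Integer.toBinaryString(value & 0xFF)).replace(' ', '0');
    }

    // Formats a value from 0 to 255 as two upper-case hex digits with a 0x prefix.
    static String toHex(int value) {
        return String.format("0x%02X", value & 0xFF);
    }

    public static void main(String[] args) {
        System.out.println("Value\tBits\t\tHex");
        for (int value : new int[] {0, 1, 15, 32, 255}) {
            System.out.println(value + "\t" + toBits(value) + "\t" + toHex(value));
        }
    }
}
```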
So we need to extend the table above to have 8 bits. Let’s take a few examples:
| Bits | 2^7 | 2^6 | 2^5 | 2^4 | 2^3 | 2^2 | 2^1 | 2^0 | + | Total | HEX |
|---|---|---|---|---|---|---|---|---|---|---|---|
| | 128 | 64 | 32 | 16 | 8 | 4 | 2 | 1 | | | |
| 00000001 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 0x01 |
| 00000010 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 2 | 2 | 0x02 |
| 00000011 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 2+1 | 3 | 0x03 |
| 00000100 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 4 | 4 | 0x04 |
| 00001111 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 8+4+2+1 | 15 | 0x0F |
| 00011111 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 16+…+1 | 31 | 0x1F |
| 00100000 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 32 | 32 | 0x20 |
| 11111111 | 1 | 1 | 1 |