Binary and hex dumps


1. Binary and hex dumps
These notes are based on class on 2019-09-03.

2. Encoding issues
There may be times when programs and data do not work because of character encoding issues.

That is, everything looks fine on the outside but inside there is something wrong.

3. Car analogy
When your car is not working right, you might take your car to someone who knows about how cars work and pay them to find and fix the problem.

When your program is not working right, you might take your program to someone who knows about how programs work and pay them to find and fix the problem.

But wait! You are the programmer! So maybe you need to know how to get inside the program and/or data to find a problem and then fix it or otherwise address it.

4. Programs
Consider the three programs, hello.c, hello1.c, and hello2.c.

The Linux cat command is used to see the contents of each file. Here is the trace (via an SSH (Secure Shell) terminal session on CentOS 7).

Note: The command prompt is set to 1 [robin] [~], as in shell level 1, user robin, current path ~ (home directory).
1 [robin] [~] cat hello.c
#include <stdio.h>

int main(void)
{
 printf("Robin Snyder\n");
 return 0;
}
1 [robin] [~] cat hello1.c
#include <stdio.h>

int main(void)
{
 printf("Robin Snyder\n");
 return 0;
}
1 [robin] [~] cat hello2.c
#include <stdio.h>

int main(void)
{
 printf("Robin Snyder\n");
 return 0;
}
1 [robin] [~]

Do you see any differences in the three programs? There is no observable difference: all three look the same in the cat output, and they also look the same in the nano command line text editor.

5. Hex and binary dump
The command line program xxd can be used to see the hex and bit level values of the underlying program/data.

Here is a trace of the three files using the xxd command.
1 [robin] [~] xxd hello.c
0000000: 2369 6e63 6c75 6465 203c 7374 6469 6f2e  #include <stdio.
0000010: 683e 0a0a 696e 7420 6d61 696e 2876 6f69  h>..int main(voi
0000020: 6429 0a7b 0a20 7072 696e 7466 2822 526f  d).{. printf("Ro
0000030: 6269 6e20 536e 7964 6572 5c6e 2229 3b0a  bin Snyder\n");.
0000040: 2072 6574 7572 6e20 303b 0a7d 0a          return 0;.}.
1 [robin] [~] xxd hello1.c
0000000: 2369 6e63 6c75 6465 203c 7374 6469 6f2e  #include <stdio.
0000010: 683e 0d0a 0d0a 696e 7420 6d61 696e 2876  h>....int main(v
0000020: 6f69 6429 0d0a 7b0d 0a20 7072 696e 7466  oid)..{.. printf
0000030: 2822 526f 6269 6e20 536e 7964 6572 5c6e  ("Robin Snyder\n
0000040: 2229 3b0d 0a20 7265 7475 726e 2030 3b0d  ");.. return 0;.
0000050: 0a7d 0d0a 0d0a                           .}....
1 [robin] [~] xxd hello2.c
0000000: 2369 6e63 6c75 6465 203c 7374 6469 6f2e  #include <stdio.
0000010: 683e 0a0a 696e 7420 6d61 696e 2876 6f69  h>..int main(voi
0000020: 6429 0a7b 0a20 7072 696e 7466 2822 526f  d).{. printf("Ro
0000030: 6269 6ec2 a053 6e79 6465 725c 6e22 293b  bin..Snyder\n");
0000040: 0a20 7265 7475 726e 2030 3b0a 7d0a 0a    . return 0;.}..
1 [robin] [~]

Note that the leftmost column shows the byte offset of each line, starting at 0000000 (since computer scientists tend to start counting at zero and not at one).
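
To make the layout of the dump concrete, here is a minimal sketch in Python that produces lines in the same shape as the xxd trace: a byte offset, the bytes in hex grouped two at a time, and the printable characters. The function name hex_dump is made up for this illustration, and the sketch assumes an even number of bytes per line. It is not how xxd itself is implemented.

```python
def hex_dump(data: bytes, width: int = 16) -> list:
    """Return xxd-style lines: offset, hex byte pairs, printable ASCII.

    Illustrative sketch only; assumes width is an even number.
    """
    lines = []
    # Width of the hex column when a line is full (8 groups of 4 digits
    # separated by single spaces when width is 16).
    hex_width = (width // 2) * 5 - 1
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        # Group the bytes two at a time, as in the trace.
        pairs = [chunk[i:i + 2].hex() for i in range(0, len(chunk), 2)]
        hex_part = " ".join(pairs)
        # Bytes outside printable ASCII (32..126) are shown as a dot.
        text = "".join(chr(b) if 32 <= b <= 126 else "." for b in chunk)
        # Pad the hex column so the text column lines up on a short last line.
        lines.append(f"{offset:07x}: {hex_part:<{hex_width}}  {text}")
    return lines


# The first 20 bytes of hello.c, taken from the trace.
sample = b"#include <stdio.h>\n\n"
for line in hex_dump(sample):
    print(line)
```

The offset at the start of each line counts bytes from zero, so the second line starts at hex 10, which is byte 16.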

Do you see the differences?

Until we get to programs that show the differences, the following lists the differences relative to hello.c.

hello1.c: every line ending is the two bytes 0d 0a instead of the single byte 0a. The first occurrence is at offsets 0000012 and 0000013. The file also ends with one extra empty line (an additional 0d 0a).

hello2.c: the single byte 20 at offset 0000033 (the space between "Robin" and "Snyder") is replaced by the two bytes c2 a0 at offsets 0000033 and 0000034. The file also ends with one extra 0a.
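
As a rough idea of what a program that shows the differences might look like, here is a small Python sketch. It uses SequenceMatcher from the standard difflib module to line up two byte strings and report only the runs that differ. The function name byte_differences and the sample byte strings are assumptions made for this illustration.

```python
from difflib import SequenceMatcher


def byte_differences(a: bytes, b: bytes) -> list:
    """Return (offset in a, bytes in a as hex, bytes in b as hex) per differing run."""
    # autojunk=False turns off a heuristic that is meant for long text inputs.
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    result = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            result.append((i1, a[i1:i2].hex(" "), b[j1:j2].hex(" ")))
    return result


# The name as it appears in hello.c and in hello2.c (bytes taken from the dumps).
plain = b"Robin Snyder"
other = b"Robin\xc2\xa0Snyder"
for offset, old, new in byte_differences(plain, other):
    print(f"offset {offset}: {old} -> {new}")
```

A byte by byte comparison at equal offsets would not be enough here, because one added byte shifts everything after it. That is why the sketch aligns the two sequences first.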


6. Line separators
One important difference involves the bytes 0d and 0a. Traditionally, Linux uses the line feed character (LF, hex 0a) to separate lines in a text file. Windows uses the carriage return (CR, hex 0d) and line feed character pair to separate lines in a text file.

In Windows, older versions of Notepad (before Windows 10 version 1809) do not understand the difference, so Linux text files opened in Notepad will be on one long line. Notepad++ does understand the difference.

But sometimes these differences are more important. For example, a Bash script file in Windows format will not run in a Linux environment (unless, of course, there is only one line in the text file so that there are no line separators or breaks).
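
Here is a minimal Python sketch of how one might check which line separators a file uses and convert Windows format to Linux format. The function names and the two-line sample script are hypothetical, chosen only for this illustration.

```python
def count_line_endings(data: bytes) -> dict:
    """Count Windows (CR LF) and Linux (bare LF) line endings in data."""
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf  # every CR LF pair also contains one LF
    return {"crlf": crlf, "lf": lf}


def to_linux_line_endings(data: bytes) -> bytes:
    """Replace each CR LF pair with a single LF."""
    return data.replace(b"\r\n", b"\n")


# Hypothetical two-line Bash script saved in Windows format.
script = b"#!/bin/bash\r\necho hello\r\n"
print(count_line_endings(script))

# Splitting on LF alone leaves the CR attached to the first line, so the
# interpreter path on that line ends in a stray carriage return.
print(script.split(b"\n")[0])

print(to_linux_line_endings(script))
```

The stray carriage return at the end of the first line is one likely reason a script in Windows format fails on Linux: the system looks for an interpreter whose name ends in that extra character.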

7. Spaces
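Another important difference involves the bytes c2 a0 in hello2.c. These two bytes take the place of the ordinary space (hex 20) between the "Robin" and "Snyder" in "Robin Snyder". They are the UTF-8 encoding of the non-breaking space character, U+00A0. The HTML for a non-breaking space is the entity &nbsp;.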
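
The bytes c2 a0 seen in the hello2.c dump can be found and identified with a short Python sketch like the following. It assumes the data is valid UTF-8, and the function name non_ascii_characters is made up for this illustration.

```python
import unicodedata


def non_ascii_characters(data: bytes) -> list:
    """Return (byte offset, code point, Unicode name) for each non-ASCII character.

    Assumes data is valid UTF-8; decode() raises UnicodeDecodeError otherwise.
    """
    found = []
    offset = 0
    for ch in data.decode("utf-8"):
        encoded = ch.encode("utf-8")
        if len(encoded) > 1:  # ASCII characters always encode to one byte
            name = unicodedata.name(ch, "UNKNOWN")
            found.append((offset, f"U+{ord(ch):04X}", name))
        offset += len(encoded)
    return found


# The name as stored in hello2.c (bytes taken from the dump).
data = b"Robin\xc2\xa0Snyder"
print(non_ascii_characters(data))

# One possible repair: replace the non-breaking space with an ordinary space.
repaired = data.decode("utf-8").replace("\u00a0", " ").encode("utf-8")
print(repaired)
```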