19. Binary Data Handling

The struct module in Python converts between Python values and packed binary data (bytes objects) laid out like C structs. It is commonly used to read and write binary files and to work with binary formats such as network protocols, image formats, or audio files.

Key Functions of `struct` Module:

struct.pack(): Packs Python values into a bytes object according to a format string.
struct.unpack(): Unpacks a bytes object into a tuple of Python values according to a format string.
struct.calcsize(): Returns the size, in bytes, of the data described by a format string.

Here are some Python code examples demonstrating binary data handling using the struct module:

1. Packing and Unpacking Data

This example demonstrates how to pack data into a binary format and then unpack it back into Python data types.

import struct

# Packing data into binary
data = (1, 2.5, b'abc')
packed_data = struct.pack('I f 3s', *data)  # 'I' for unsigned int, 'f' for float, '3s' for 3-byte string
print(f"Packed Data: {packed_data}")

# Unpacking binary data back to Python data types
unpacked_data = struct.unpack('I f 3s', packed_data)
print(f"Unpacked Data: {unpacked_data}")

Explanation:

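The format string 'I f 3s' specifies the data types:
- I: unsigned integer (4 bytes)
- f: single-precision float (4 bytes)
- 3s: bytes string of exactly 3 bytes
pack() converts Python values into a bytes object, and unpack() converts it back, always returning a tuple. Whitespace between format characters is ignored.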
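To make the format codes more concrete, the following sketch slices a packed value into its three fields and prints each one as hex. It assumes the < prefix (little-endian, standard sizes, no padding) so that the layout is the same on every platform; the variable names are only for illustration.

```python
import struct

# '<' selects little-endian byte order with standard sizes and no padding,
# so the layout below does not depend on the machine running the code.
packed = struct.pack('<I f 3s', 1, 2.5, b'abc')

int_bytes = packed[0:4]     # unsigned int 1
float_bytes = packed[4:8]   # float 2.5 (IEEE 754 single precision)
str_bytes = packed[8:11]    # the 3 raw bytes of b'abc'

print(int_bytes.hex())    # 01000000
print(float_bytes.hex())  # 00002040
print(str_bytes.hex())    # 616263
print(len(packed))        # 11
```

Each field occupies a fixed number of bytes at a fixed offset, which is what lets unpack() recover the values from the format string alone.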

2. Reading and Writing Binary Files

This example shows how to write binary data to a file and read it back.

Writing binary data to a file:

import struct

# Data to write
data = (1, 2.5, b'abc')

# Open a file in write-binary mode
with open('binary_data.dat', 'wb') as f:
    packed_data = struct.pack('I f 3s', *data)
    f.write(packed_data)
    print("Data written to binary file")

Reading binary data from a file:

import struct

# Open the binary file in read-binary mode
with open('binary_data.dat', 'rb') as f:
    packed_data = f.read()
    unpacked_data = struct.unpack('I f 3s', packed_data)
    print(f"Unpacked Data from file: {unpacked_data}")

Explanation:

Writing: struct.pack() converts the data into a binary format, which is then written to the file using the write() method.
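Reading: The read() method reads the whole file as bytes, which are then unpacked back into their original form using struct.unpack(). The number of bytes read must match the size of the format exactly; otherwise unpack() raises struct.error.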
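Real files usually hold many records rather than one. The sketch below is one possible way to handle that, assuming a hypothetical fixed record layout and a temporary file: it precompiles the format with struct.Struct and reads the file one record at a time using the record size.

```python
import os
import struct
import tempfile

# Hypothetical record layout for this sketch: unsigned int, float, 3-byte string.
record = struct.Struct('<I f 3s')
rows = [(1, 2.5, b'abc'), (2, 0.5, b'def'), (3, 4.0, b'ghi')]

# Write to a temporary location so the example leaves no files in the project.
path = os.path.join(tempfile.mkdtemp(), 'records.dat')

with open(path, 'wb') as f:
    for row in rows:
        f.write(record.pack(*row))

loaded = []
with open(path, 'rb') as f:
    while True:
        chunk = f.read(record.size)   # read exactly one record
        if not chunk:
            break                     # end of file
        if len(chunk) != record.size:
            raise ValueError("truncated record at end of file")
        loaded.append(record.unpack(chunk))

print(loaded)
```

Reading in chunks of record.size keeps memory use constant regardless of file size, and the explicit length check catches files that were cut off mid-record.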

3. Handling Binary Data with Variable Length Strings

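Sometimes you may need to handle binary data that contains strings of different lengths. The s format code always describes a fixed number of bytes, so the length has to be part of the format string. Here is an example of packing and unpacking a string whose length (13 bytes) is written into the format.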

import struct

# An integer and a 13-byte string
data = (123, b"Hello, World!")

# Pack the data; the string length (13) is fixed in the format string
packed_data = struct.pack('I 13s', data[0], data[1])
print(f"Packed Data: {packed_data}")

# Unpack the binary data
unpacked_data = struct.unpack('I 13s', packed_data)
print(f"Unpacked Data: {unpacked_data}")

Explanation:

The format string I 13s specifies an unsigned integer (I) and a fixed-length string of 13 bytes (13s).
This example only works as intended when the string is exactly 13 bytes: shorter values are padded with null bytes and longer values are truncated. If the length varies, build the format string at run time or store the length alongside the data.
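One common way to handle strings whose length really does vary is a length prefix: store the length first, then the bytes. The following is a minimal sketch of that idea, not a standard API; the helper names pack_string and unpack_string and the choice of a 4-byte big-endian length are assumptions made for illustration.

```python
import struct

def pack_string(text: bytes) -> bytes:
    """Pack a byte string as a 4-byte big-endian length followed by the bytes."""
    return struct.pack(f'!I{len(text)}s', len(text), text)

def unpack_string(buffer: bytes) -> bytes:
    """Read the length prefix, then unpack exactly that many bytes."""
    (length,) = struct.unpack_from('!I', buffer, 0)
    (text,) = struct.unpack_from(f'!{length}s', buffer, 4)
    return text

packed = pack_string(b"Hello, World!")
print(packed[:4])              # b'\x00\x00\x00\r'  (length 13)
print(unpack_string(packed))   # b'Hello, World!'
```

Because the reader learns the length from the data itself, the same two functions work for strings of any size that fits in the 4-byte prefix.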

4. Using `calcsize` to Determine Structure Size

The struct.calcsize() function can be used to determine the size of the struct format.

import struct

# Define a format string for a struct
format_string = 'I f 3s'

# Get the size of the struct
size = struct.calcsize(format_string)
print(f"Size of the struct: {size} bytes")

Explanation:

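struct.calcsize() returns the number of bytes that pack() produces, and that unpack() expects, for the given format string. For 'I f 3s' this is typically 11. This is helpful when you need to read exactly one record from a file or socket, or to check the length of a buffer before unpacking it.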
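As an illustration of that use, the sketch below checks a buffer's length against calcsize() and then separates a fixed-size header from whatever follows it. The names HEADER_FORMAT and split_message are hypothetical, and the format uses an explicit < prefix so the size is the same on every platform.

```python
import struct

HEADER_FORMAT = '<I f 3s'   # unsigned int, float, 3-byte string, explicit byte order
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)

def split_message(buffer: bytes):
    """Return (header_fields, payload), or raise if the buffer is too short."""
    if len(buffer) < HEADER_SIZE:
        raise ValueError(f"need {HEADER_SIZE} bytes, got {len(buffer)}")
    header = struct.unpack(HEADER_FORMAT, buffer[:HEADER_SIZE])
    return header, buffer[HEADER_SIZE:]

message = struct.pack(HEADER_FORMAT, 7, 1.5, b'xyz') + b'payload'
header, payload = split_message(message)
print(HEADER_SIZE)  # 11
print(header)       # (7, 1.5, b'xyz')
print(payload)      # b'payload'
```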

5. Working with Packed Binary Data for Networking

In networking applications, you may need to send and receive binary data. Here’s how to handle such scenarios:

Packing data for network transmission:

import struct

# Prepare data to send (integer, float, string)
data = (101, 3.14, b'hello')

# Pack the data into a binary string
packed_data = struct.pack('I f 5s', *data)

# Send this packed data over the network (simulated)
print(f"Packed Data: {packed_data}")

Unpacking received binary data:

import struct

# Simulate receiving packed binary data (in real use, this would come from a socket)
received_data = struct.pack('I f 5s', 101, 3.14, b'hello')

# Unpack the data
unpacked_data = struct.unpack('I f 5s', received_data)
print(f"Unpacked Data: {unpacked_data}")

Explanation:

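This is a simple simulation of how binary data might be packed for network transmission using the struct module and later unpacked when received.
The 5s format specifies a string of exactly 5 bytes, which suits protocols with fixed-length fields. Because f stores only single precision, 3.14 comes back as approximately 3.140000104904175. The format here uses the machine's native byte order; for data exchanged between different machines, prefix the format with ! to force network byte order.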
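The simulation above never touches a socket. The sketch below shows roughly how the same message could travel over a real connection, using socket.socketpair() to get two connected sockets inside one process. The recv_exact helper is a hypothetical name introduced here; it loops because recv() is allowed to return fewer bytes than requested.

```python
import socket
import struct

MESSAGE = struct.Struct('!I f 5s')   # network byte order, 13 bytes

def recv_exact(sock: socket.socket, size: int) -> bytes:
    """Read exactly `size` bytes, looping because recv() may return fewer."""
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("socket closed before the full message arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b''.join(chunks)

# Two connected sockets in the same process stand in for a client and server.
sender, receiver = socket.socketpair()
try:
    sender.sendall(MESSAGE.pack(101, 0.25, b'hello'))
    fields = MESSAGE.unpack(recv_exact(receiver, MESSAGE.size))
finally:
    sender.close()
    receiver.close()

print(fields)  # (101, 0.25, b'hello')
```

The value 0.25 is used instead of 3.14 because it is exactly representable in single precision, so the received tuple matches what was sent.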

6. Packing and Unpacking Multiple Entries

You can pack and unpack multiple entries at once. This example shows how to handle multiple entries in a binary format.

import struct

# Data: list of integers and floats
data = [(1, 2.5), (2, 3.6), (3, 4.7)]

# Packing multiple entries
packed_data = struct.pack('I f' * len(data), *[item for sublist in data for item in sublist])
print(f"Packed Data: {packed_data}")

# Unpacking multiple entries
unpacked_data = struct.unpack('I f' * len(data), packed_data)
print(f"Unpacked Data: {unpacked_data}")

Explanation:

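By repeating the format I f for each entry in the list, we can pack and unpack multiple records in a single call. Each record consists of an integer and a float. Note that unpack() returns one flat tuple of six values rather than three pairs, and that values such as 3.6 come back slightly changed because f stores only single precision.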
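If you would rather get the records back as separate tuples, struct.iter_unpack() is one option. The sketch below assumes a little-endian record of one unsigned int and one float, with sample values chosen so that they survive single precision unchanged.

```python
import struct

record_format = '<I f'
records = [(1, 2.5), (2, 0.5), (3, 4.75)]

# Pack each record separately and join the results into one buffer.
packed = b''.join(struct.pack(record_format, *rec) for rec in records)

# iter_unpack yields one tuple per record; the buffer length must be an
# exact multiple of the record size.
restored = list(struct.iter_unpack(record_format, packed))
print(restored)  # [(1, 2.5), (2, 0.5), (3, 4.75)]
```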

7. Handling Big Endian and Little Endian Data

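In some binary file formats or network protocols, you might encounter big-endian or little-endian byte orders. Big-endian stores the most significant byte first; little-endian stores the least significant byte first.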

import struct

# Big-endian (network byte order) packing
data = (1, 2.5)
packed_data = struct.pack('!I f', *data)  # '!' specifies network (big-endian) order
print(f"Packed Big-endian Data: {packed_data}")

# Unpacking big-endian data
unpacked_data = struct.unpack('!I f', packed_data)
print(f"Unpacked Big-endian Data: {unpacked_data}")

Explanation:

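The prefix character ! at the start of the format string specifies network (big-endian) byte order. The prefixes > and < select big-endian and little-endian explicitly. All three also switch to standard sizes with no alignment padding. Use ! when dealing with network protocols where byte order is standardized.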
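The example above only shows the big-endian case. The short sketch below packs the same value with each prefix so the difference is visible in the bytes; the value 0x12345678 is chosen only because its bytes are easy to tell apart.

```python
import struct

value = 0x12345678

big = struct.pack('>I', value)
little = struct.pack('<I', value)
network = struct.pack('!I', value)

print(big.hex())        # 12345678
print(little.hex())     # 78563412
print(network == big)   # True

# Reading bytes with the wrong byte order silently gives a different number.
(misread,) = struct.unpack('<I', big)
print(hex(misread))     # 0x78563412
```

No error is raised when the byte order is wrong, so the prefix used for unpacking has to match the one used for packing.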

8. Working with Signed and Unsigned Integers

You can specify signed or unsigned integers while packing and unpacking data.

import struct

# Packing signed and unsigned integers
data = (123, -456)
packed_data = struct.pack('I i', data[0], data[1])
print(f"Packed Data: {packed_data}")

# Unpacking data
unpacked_data = struct.unpack('I i', packed_data)
print(f"Unpacked Data: {unpacked_data}")

Explanation:

I represents an unsigned integer (4 bytes), and i represents a signed integer (also 4 bytes).
Use i when the data can contain negative numbers; packing a negative value with I raises struct.error.
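The following sketch illustrates both points: an out-of-range value is rejected when packing, and the same four bytes produce different numbers depending on whether they are read as signed or unsigned. The < prefix is an assumption made to keep the sizes fixed across platforms.

```python
import struct

# A negative number does not fit in an unsigned field.
try:
    struct.pack('<I', -456)
    rejected = False
except struct.error as exc:
    rejected = True
    print(f"Cannot pack: {exc}")

# The same bytes, interpreted two ways.
raw = struct.pack('<i', -456)
(as_signed,) = struct.unpack('<i', raw)
(as_unsigned,) = struct.unpack('<I', raw)
print(as_signed)     # -456
print(as_unsigned)   # 4294966840
```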

9. Handling Floats with Precision

You can choose between single precision (f, 4 bytes) and double precision (d, 8 bytes) when packing and unpacking floating-point numbers.

import struct

# Packing a float in double precision
data = (3.1415926535,)
packed_data = struct.pack('d', data[0])  # 'd' specifies double precision float (8 bytes)
print(f"Packed Data: {packed_data}")

# Unpacking data
unpacked_data = struct.unpack('d', packed_data)
print(f"Unpacked Data: {unpacked_data}")

Explanation:

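The d format specifies a double-precision floating-point number (8 bytes). Python's own float type is double precision, so d round-trips values exactly, whereas f may round them. This is useful for applications requiring high precision, such as scientific computing.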
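To see the difference between the two formats, the sketch below sends the same value through both and compares the results with the original. It is only an illustration of rounding, not a recommendation for either format.

```python
import struct

value = 3.1415926535

(as_double,) = struct.unpack('<d', struct.pack('<d', value))
(as_single,) = struct.unpack('<f', struct.pack('<f', value))

print(as_double == value)               # True: 'd' preserves the value
print(as_single == value)               # False: 'f' rounds to about 7 significant digits
print(abs(as_single - value) < 1e-6)    # True: the rounding error is small
```

Choosing f halves the storage per number at the cost of that rounding, which is often acceptable for sensor readings or graphics data but not for accumulated calculations.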

10. Padding with `struct`

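In native mode (the default, or an explicit @ prefix), the struct module inserts padding bytes between fields so that each field starts at its natural alignment, just as a C compiler would. You can also add padding explicitly with the x format code.

import struct

# Packing data with padding
data = (b'hello', 1)
packed_data = struct.pack('5s I', *data)  # pad bytes follow the 5-byte string so the int is aligned
print(f"Packed Data with Padding: {packed_data}")
print(f"Size: {struct.calcsize('5s I')} bytes")  # typically 12, not 9

# Unpacking data with padding
unpacked_data = struct.unpack('5s I', packed_data)
print(f"Unpacked Data with Padding: {unpacked_data}")

Explanation:

In this example, padding occurs automatically before the integer field: the 5-byte string ends at offset 5, but a 4-byte integer typically has to start at a multiple of 4, so three null bytes are inserted. The reverse order 'I 5s' needs no padding, because s fields have no alignment requirement and native mode adds no padding at the end. Padding is only added in native mode; formats that start with <, >, = or ! are packed without it.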
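Because native padding depends on the platform, file formats and protocols often spell the padding out instead. The sketch below uses a standard-size format with the x pad code to reserve three bytes after a 5-byte string; the last line prints the native size for comparison, and its result may differ between machines.

```python
import struct

# Standard sizes, no automatic padding: 5 + 4 = 9 bytes.
print(struct.calcsize('<5sI'))    # 9

# Explicit padding with 'x': three pad bytes, written as zeros.
packed = struct.pack('<5s3xI', b'hello', 1)
print(len(packed))                # 12
print(packed)                     # b'hello\x00\x00\x00\x01\x00\x00\x00'

# Pad bytes consume no arguments and produce no values when unpacking.
print(struct.unpack('<5s3xI', packed))   # (b'hello', 1)

# Native mode: the result depends on the platform's alignment rules.
print(struct.calcsize('@5sI'))
```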

These examples demonstrate how to work with binary data using Python's struct module, which is an essential tool for working with binary files, network protocols, and performance-critical applications.


