Data Format Conversion

2023-09-11

Handling binary data is a common task in Python. MicroPython provides two modules for it: ustruct, which packs and unpacks binary data, and ubinascii, which encodes and decodes binary data as text. This article introduces the functions of both modules and provides usage examples.

ustruct

The ustruct module is used for handling binary data in MicroPython. It can convert Python data types to binary data and vice versa. The ustruct module provides five functions: pack, unpack, calcsize, pack_into, and unpack_from.

1.1 Format Table Supported by ustruct

The formats supported by ustruct are listed in the table below:

Format	C Type	Python	Bytes
`x`	pad byte	no value	1
`c`	`char`	bytes of length 1	1
`b`	`signed char`	integer	1
`B`	`unsigned char`	integer	1
`?`	`_Bool`	bool	1
`h`	`short`	integer	2
`H`	`unsigned short`	integer	2
`i`	`int`	integer	4
`I`	`unsigned int`	integer	4
`l`	`long`	integer	4
`L`	`unsigned long`	integer	4
`q`	`long long`	integer	8
`Q`	`unsigned long long`	integer	8
`f`	`float`	float	4
`d`	`double`	float	8
`s`	`char[]`	bytes	1
`p`	`char[]`	bytes	1
`P`	`void *`	integer	4

Note 1: q and Q are only meaningful when the machine supports 64-bit operations.

Note 2: A number can be placed before each format character as a repeat count. For example, 3H is equivalent to HHH.

Note 3: For the s format, the number is the length of the string rather than a repeat count. 4s represents a single string of 4 bytes. The p format represents a Pascal string, which stores a length byte before the data.

Note 4: P represents a pointer, and its length depends on the machine word length. On Quectel modules, it occupies 4 bytes.
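
As a rough illustration of the repeat count and the s format described in the notes, the sketch below packs three unsigned shorts and one 4-byte string. It fixes the byte order with '<' so the result does not depend on the machine. The fallback import of struct is an assumption that only lets the same snippet run on desktop Python.

```python
try:
    import ustruct as struct  # MicroPython / QuecPython
except ImportError:
    import struct  # desktop Python fallback (assumption, for testing only)

# A count before an integer code repeats it: '3H' means three unsigned shorts.
three_shorts = struct.pack('<3H', 1, 2, 3)
print(three_shorts)  # b'\x01\x00\x02\x00\x03\x00'

# A count before 's' is a length: '4s' is ONE 4-byte field, not four values.
padded = struct.pack('<4s', b'ab')  # shorter input is padded with zero bytes
print(padded)  # b'ab\x00\x00'

# Unpacking returns three integers and a single bytes object.
fields = struct.unpack('<3H4s', three_shorts + padded)
print(fields)  # (1, 2, 3, b'ab\x00\x00')
```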

1.2 ustruct Alignment

To exchange data with C structures, keep in mind that C and C++ compilers usually align structure members, typically to 4-byte boundaries on a 32-bit system. By default, ustruct therefore converts data using the byte order, sizes, and alignment of the local machine. The first character of the format string can be used to change this behavior. The definitions are as follows:

Character	Byte order	Size	Alignment
@(default)	Native	Native	Native, aligned to 4 bytes
=	Native	Standard	None, aligned to original size
<	Little-endian	Standard	None, aligned to original size
>	Big-endian	Standard	None, aligned to original size
!	Network (big-endian)	Standard	None, aligned to original size
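
The effect of the prefix character can be checked with calcsize. The following is a minimal sketch: the size reported for the native '@' mode depends on the platform, so only the standard-size result is stated as a fixed value. The fallback import of struct is an assumption for running the snippet on desktop Python.

```python
try:
    import ustruct as struct  # MicroPython / QuecPython
except ImportError:
    import struct  # desktop Python fallback (assumption, for testing only)

# Standard size, no alignment: 1 byte + 4 bytes, stored back to back.
packed_size = struct.calcsize('<BI')
print(packed_size)  # 5

# Native mode may insert pad bytes so that the 4-byte field starts on an
# aligned offset. The exact value is platform dependent.
native_size = struct.calcsize('@BI')
print(native_size)

# With '<', the unsigned int directly follows the single byte.
data = struct.pack('<BI', 1, 2)
print(data)  # b'\x01\x02\x00\x00\x00'
```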

1.3 pack and unpack Functions

The pack function packs a sequence of Python values into a byte string according to the specified format (fmt), while the unpack function unpacks a byte string into a tuple according to the specified format string (fmt).

Here is an example of using the pack and unpack functions:

import ustruct

buf = ustruct.pack('4sI', b'MPYN', 12345)
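print(buf)  # Output b'MPYN90\x00\x00'

s, i = ustruct.unpack('4sI', buf)
print(s, i)  # Output b'MPYN' 12345

In this example, '4sI' describes a 4-byte string followed by a 4-byte unsigned integer. No byte-order prefix is given, so the native byte order is used. On a little-endian machine, 12345 (0x3039) is stored as the bytes 0x39 0x30 0x00 0x00, and the first two are printed as the ASCII characters '9' and '0'.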
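
When the data is sent to another device, it is usually safer to state the byte order explicitly. The sketch below packs a hypothetical sensor frame (the field names and layout are made up for illustration) with '<' and unpacks it again. The fallback import of struct is an assumption for running the snippet on desktop Python.

```python
try:
    import ustruct as struct  # MicroPython / QuecPython
except ImportError:
    import struct  # desktop Python fallback (assumption, for testing only)

# Hypothetical frame layout: device id (1 byte), sequence number (2 bytes),
# temperature in hundredths of a degree (4 bytes, signed), little-endian.
FRAME_FMT = '<BHi'

frame = struct.pack(FRAME_FMT, 7, 513, -250)
print(frame)  # b'\x07\x01\x02\x06\xff\xff\xff'

device_id, sequence, temperature = struct.unpack(FRAME_FMT, frame)
print(device_id, sequence, temperature)  # 7 513 -250
```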

1.4 calcsize Function

The calcsize function returns the length of the byte string required for the specified format (fmt).

Here is an example of using the calcsize function:

import ustruct

size = ustruct.calcsize('4sI')
print(size)  # Output 8

In this example, '4sI' represents packing with a 4-byte string and an integer, requiring a length of 8 bytes.
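
One common use of calcsize is to check the length of received data before unpacking it. The following is a minimal sketch: parse_header is a hypothetical helper, not part of ustruct, and the '<' prefix is added so that the size does not depend on the machine.

```python
try:
    import ustruct as struct  # MicroPython / QuecPython
except ImportError:
    import struct  # desktop Python fallback (assumption, for testing only)

HEADER_FMT = '<4sI'  # 4-byte tag followed by a 4-byte unsigned integer


def parse_header(payload):
    """Hypothetical helper: unpack a header only if the length matches."""
    if len(payload) != struct.calcsize(HEADER_FMT):
        raise ValueError('unexpected payload length')
    return struct.unpack(HEADER_FMT, payload)


print(parse_header(b'MPYN\x39\x30\x00\x00'))  # (b'MPYN', 12345)
```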

1.5 pack_into Function

The pack_into function packs a sequence of Python values into a specified buffer (buffer) at a specified offset (offset) according to the specified format (fmt).

Here is an example of using the pack_into function:

import ustruct

buf = bytearray(8)
ustruct.pack_into('>hhl', buf, 0, 32767, -12345, 123456789)
print(buf)  # Output: bytearray(b'\x7f\xff\xcf\xc7\x07[\xcd\x15')

In this example, '>hhl' uses big-endian byte order to pack two 16-bit signed integers and one 32-bit signed integer, and writes them into the buffer starting at offset 0. The value 123456789 is 0x075bcd15, and the byte 0x5b is printed as the ASCII character '['.
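
The offset argument makes it possible to fill one preallocated buffer with several records. The sketch below is an illustration under assumed values: it sizes the buffer with calcsize and writes two records back to back. The fallback import of struct is an assumption for running the snippet on desktop Python.

```python
try:
    import ustruct as struct  # MicroPython / QuecPython
except ImportError:
    import struct  # desktop Python fallback (assumption, for testing only)

RECORD_FMT = '>hh'  # two signed 16-bit values, big-endian
record_size = struct.calcsize(RECORD_FMT)  # 4 bytes per record

# Allocate the buffer once, then write each record at its own offset.
buf = bytearray(record_size * 2)
struct.pack_into(RECORD_FMT, buf, 0, 1, -1)
struct.pack_into(RECORD_FMT, buf, record_size, 256, 2)
print(buf)  # bytearray(b'\x00\x01\xff\xff\x01\x00\x00\x02')
```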

1.6 unpack_from Function

The unpack_from function unpacks binary data into a tuple according to the specified format string (fmt), starting from the specified offset (offset) in a byte string.

Here is an example of using the unpack_from function:

import ustruct

data = b'\x01\x02\x03\x04\x05\x06\x07\x08'
a, b = ustruct.unpack_from('HH', data, 2)
print(a, b)  # Output 1027 1541

In this example, 'HH' indicates the unpacking of two 16-bit unsigned integers. The unpack_from function unpacks the data starting from offset 2 in the byte string, so it reads the bytes 0x03 0x04 0x05 0x06. No byte-order prefix is given, so on a little-endian machine the results are 0x0403 (1027) and 0x0605 (1541).
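
A typical use of the offset is to walk through a buffer that holds several records of the same layout. The following sketch assumes little-endian records of two unsigned shorts each. The fallback import of struct is an assumption for running the snippet on desktop Python.

```python
try:
    import ustruct as struct  # MicroPython / QuecPython
except ImportError:
    import struct  # desktop Python fallback (assumption, for testing only)

data = b'\x01\x02\x03\x04\x05\x06\x07\x08'
RECORD_FMT = '<HH'  # two unsigned 16-bit values, little-endian
record_size = struct.calcsize(RECORD_FMT)  # 4 bytes per record

# Step through the buffer one record at a time without slicing it.
records = []
for offset in range(0, len(data), record_size):
    records.append(struct.unpack_from(RECORD_FMT, data, offset))

print(records)  # [(513, 1027), (1541, 2055)]
```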

1.7 Endianness Testing

MicroPython supports both big-endian and little-endian byte order when dealing with binary data. The byte order can be specified by the format string as either big-endian ('>') or little-endian ('<') for packing and unpacking.

Here is an example of using both big-endian and little-endian byte order:

import ustruct

buf = ustruct.pack('>Hl', 32767, 123456789)
print(buf)  # Output b'\x7f\xff\x07[\xcd\x15'

s, i = ustruct.unpack('<Hl', buf)
print(s, i)  # Output 65407 365779719

In this example, '>Hl' uses big-endian byte order to pack a 16-bit unsigned integer and a 32-bit signed integer into a byte string (the byte 0x5b is printed as '['). Unpacking the same byte string with '<Hl' interprets the bytes in little-endian order, so the results are 0xff7f (65407) and 0x15cd5b07 (365779719) instead of the original 32767 and 123456789. The same byte order must be used for packing and unpacking to recover the original values.
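
To see the difference on a single value, the sketch below packs 0x0102 in both byte orders and compares the result with the native order reported by sys.byteorder. The fallback import of struct is an assumption for running the snippet on desktop Python.

```python
import sys

try:
    import ustruct as struct  # MicroPython / QuecPython
except ImportError:
    import struct  # desktop Python fallback (assumption, for testing only)

value = 0x0102

little = struct.pack('<H', value)  # least significant byte first
big = struct.pack('>H', value)     # most significant byte first
print(little)  # b'\x02\x01'
print(big)     # b'\x01\x02'

# '=' keeps the native byte order but uses standard sizes.
native = struct.pack('=H', value)
print(sys.byteorder, native == little)

# Reading little-endian bytes as big-endian swaps the two bytes.
swapped = struct.unpack('>H', little)[0]
print(swapped)  # 513, which is 0x0201
```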

1.8 Summary

The ustruct module is used for packing, unpacking, encoding and decoding binary data, supporting both big-endian and little-endian byte order.

Its main functions include pack, unpack, calcsize, pack_into and unpack_from.

The pack function packs a sequence of Python values into a byte string according to the specified format string.

The unpack function unpacks a byte string into a tuple according to the specified format string.

The calcsize function returns the number of bytes required by the specified format string.

The pack_into function packs a sequence of Python values into a specified buffer at a specified offset according to the specified format string.

The unpack_from function unpacks a byte string into a tuple, starting from a specified offset.

By using these functions, it is convenient to perform packing, unpacking, encoding and decoding operations on binary data.

ubinascii

The ubinascii module is used for encoding and decoding binary data. It provides various encoding and decoding methods, such as hexadecimal encoding and decoding, Base64 encoding and decoding, etc.

2.1 hexlify and unhexlify Functions

The hexlify function encodes a byte string into its hexadecimal representation, returned as a bytes object, and the unhexlify function decodes a hexadecimal string back into a byte string.

Here is an example of using the hexlify and unhexlify functions:

import ubinascii

data = b'\x01\x02\x03\x04\x05\x06\x07\x08'
hexstr = ubinascii.hexlify(data)
print(hexstr)  # Output b'0102030405060708'

bytearr = ubinascii.unhexlify(hexstr)
print(bytearr)  # Output b'\x01\x02\x03\x04\x05\x06\x07\x08'

In this example, the hexlify function encodes the byte string b'\x01\x02\x03\x04\x05\x06\x07\x08' into the hexadecimal string b'0102030405060708', and the unhexlify function decodes the hexadecimal string into the byte string b'\x01\x02\x03\x04\x05\x06\x07\x08'.
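
Because hexlify returns a bytes object, it is often decoded to a str before being logged or placed in a text message. The sketch below shows this, together with the error raised for input with an odd number of hex digits. The sample bytes are arbitrary, and the fallback import of binascii is an assumption for running the snippet on desktop Python.

```python
try:
    import ubinascii as binascii  # MicroPython / QuecPython
except ImportError:
    import binascii  # desktop Python fallback (assumption, for testing only)

raw = b'\xde\xad\xbe\xef'

# hexlify returns bytes; decode() turns the result into a str.
hex_text = binascii.hexlify(raw).decode()
print(hex_text)  # deadbeef

restored = binascii.unhexlify(hex_text)
print(restored == raw)  # True

# Each byte needs two hex digits, so odd-length input is rejected.
odd_rejected = False
try:
    binascii.unhexlify(b'abc')
except ValueError:
    odd_rejected = True
print(odd_rejected)  # True
```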

2.2 b2a_base64 and a2b_base64 Functions

The b2a_base64 function encodes a byte string into a Base64 string, and the a2b_base64 function decodes a Base64 string into a byte string.

Here is an example of using the b2a_base64 and a2b_base64 functions:

import ubinascii

data = b'\x01\x02\x03\x04\x05\x06\x07\x08'
base64str = ubinascii.b2a_base64(data)
print(base64str)  # Output b'AQIDBAUGBwg=\n'

bytearr = ubinascii.a2b_base64(base64str)
print(bytearr)  # Output b'\x01\x02\x03\x04\x05\x06\x07\x08'

In this example, the b2a_base64 function encodes the byte string b'\x01\x02\x03\x04\x05\x06\x07\x08' into the Base64 string b'AQIDBAUGBwg=\n'. Note that the result ends with a newline character. The a2b_base64 function decodes the Base64 string into the byte string b'\x01\x02\x03\x04\x05\x06\x07\x08'.
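
The trailing newline is usually unwanted when the Base64 text is embedded in a message such as JSON or a URL parameter. The following is a minimal sketch that strips it and converts the result to a str. The fallback import of binascii is an assumption for running the snippet on desktop Python.

```python
try:
    import ubinascii as binascii  # MicroPython / QuecPython
except ImportError:
    import binascii  # desktop Python fallback (assumption, for testing only)

payload = b'\x01\x02\x03\x04\x05\x06\x07\x08'

encoded = binascii.b2a_base64(payload)
print(encoded)  # b'AQIDBAUGBwg=\n'

# Remove the trailing newline and convert bytes to str.
text = encoded.strip().decode()
print(text)  # AQIDBAUGBwg=

decoded = binascii.a2b_base64(text)
print(decoded == payload)  # True
```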

2.3 Summary

The ubinascii module is used for encoding and decoding binary data.

Its main functions include hexlify, unhexlify, b2a_base64 and a2b_base64.

The hexlify function encodes a byte string into a hexadecimal string.

The unhexlify function decodes a hexadecimal string into a byte string.

The b2a_base64 function encodes a byte string into a Base64 string.

The a2b_base64 function decodes a Base64 string into a byte string.

By using these functions, it is convenient to perform encoding and decoding operations on binary data.
