Serialization of Cairo types
When you interact with contracts, especially if you are a library or SDK developer that wants to construct transactions, you need to understand how Cairo handles types that are larger than 252 bits so you can correctly formulate the calldata in a transaction.
The field element (felt252), which contains 252 bits, is the only actual type in the Cairo VM. So all high-level Cairo types that are larger than 252 bits, such as u256 or arrays, are ultimately represented by a list of felts. In order to interact with a contract, you need to know how to encode its arguments as a list of felts so you can correctly formulate the calldata in the transaction.
SDKs, such as starknet.js, encode the calldata for you, so you can simply specify any type and the SDK properly formulates the calldata. For example, you don't need to know that a u256 value is represented by two felt252 values. You can simply specify a single integer in your code, and the SDK takes care of the serialization and encoding.
Data types of 252 bits or less
The following types fit in 252 bits or fewer. Each value of these types is serialized as a single-member list that contains one felt252 value.
- ContractAddress
- EthAddress
- StorageAddress
- ClassHash
- Unsigned integers smaller than 252 bits: u8, u16, u32, u64, u128, and usize
- bytes31
- felt252
- Signed integers smaller than 252 bits: i8, i16, i32, i64, and i128.
  A negative value, \(-x\), is serialized as \(P-x\), where:
  \[P = 2^{251} + 17 \cdot 2^{192} + 1\]
  For example, -5 is serialized as \(P-5\). For more information on the value of \(P\), see The STARK field.
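The rule for negative values can be sketched off-chain in a few lines of Python. This is a minimal illustration, not the code of any particular SDK; the function name serialize_signed and its bits parameter are hypothetical, and only the value of \(P\) comes from the text above.

```python
# The STARK field prime from the text: P = 2^251 + 17 * 2^192 + 1.
P = 2**251 + 17 * 2**192 + 1


def serialize_signed(value: int, bits: int) -> list[int]:
    """Serialize an i8..i128 value as a single-member list of one felt.

    Hypothetical helper: the name and the range check are assumptions
    made for this sketch.
    """
    if bits not in (8, 16, 32, 64, 128):
        raise ValueError("bits must be one of 8, 16, 32, 64, 128")
    lowest, highest = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    if not lowest <= value <= highest:
        raise ValueError(f"{value} does not fit in i{bits}")
    # Python's % always returns a non-negative result for a positive
    # modulus, so a negative value -x becomes P - x, and a non-negative
    # value is left unchanged.
    return [value % P]
```

For instance, serialize_signed(-5, 8) yields a list whose only member is \(P-5\), while serialize_signed(7, 32) yields [7].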
Data types greater than 252 bits
The following Cairo data types have non-trivial serialization:
- u256 and u512
- arrays
- enums
- structs
- ByteArray, which represents strings
Serialization of unsigned integers
Among unsigned integers, only u256 and u512 have non-trivial serialization.
Serialization of u256 values
A u256 value in Cairo is represented by two felt252 values, as follows:
- The first felt252 value contains the 128 least significant bits, usually referred to as the low part of the original u256 value.
- The second felt252 value contains the 128 most significant bits, usually referred to as the high part of the original u256 value.
For example:
- A u256 variable whose decimal value is 2 is serialized as [2,0]. To understand why, examine the binary representation of 2 and split it into two 128-bit parts, as follows:
  \[\underbrace{0\cdots0}_{\text{128 high bits}} | \underbrace{0\cdots10}_{\text{128 low bits}}\]
- A u256 variable whose value is \(2^{128}\) is serialized as [0,1]. To understand why, examine the binary representation of \(2^{128}\) and split it into two 128-bit parts, as follows:
  \[\underbrace{0\cdots01}_{\text{128 high bits}} | \underbrace{0\cdots0}_{\text{128 low bits}}\]
- A u256 variable whose value is \(2^{129}+2^{128}+20\) is serialized as [20,3]. To understand why, examine the binary representation of \(2^{129}+2^{128}+20\) and split it into two 128-bit parts, as follows:
  \[\underbrace{0\cdots011}_{\text{128 high bits}} | \underbrace{0\cdots10100}_{\text{128 low bits}}\]
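The low/high split can be reproduced off-chain with a mask and a shift. The following Python sketch is illustrative only; the name serialize_u256 is hypothetical and does not refer to any SDK function.

```python
MASK_128 = (1 << 128) - 1  # 128 one-bits, selects the low part


def serialize_u256(value: int) -> list[int]:
    """Split a u256 into [low, high], each part holding 128 bits.

    Hypothetical helper written for this sketch.
    """
    if not 0 <= value < (1 << 256):
        raise ValueError("value does not fit in u256")
    low = value & MASK_128   # 128 least significant bits
    high = value >> 128      # 128 most significant bits
    return [low, high]
```

Applied to the three values in the examples, this gives [2, 0], [0, 1], and [20, 3] respectively.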
Serialization of arrays
Arrays are serialized as follows:
<array_length>, <serialized_member_0>,…, <serialized_member_n>
For example, consider the following array of u256 values:
let POW_2_128: u256 = 0x100000000000000000000000000000000;
let array: Array<u256> = array![10, 20, POW_2_128];
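Each u256 value in the array is represented by two felt252 values. So the array above is serialized as follows:
3, // Number of array members
10,0, // Serialized member 0
20,0, // Serialized member 1
0,1 // Serialized member 2
Combining the above, the array is serialized as follows: [3,10,0,20,0,0,1]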
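As a rough off-chain illustration of the length-prefixed layout, the Python sketch below serializes a sequence by emitting its length and then each serialized member in order. The helpers serialize_array and serialize_u256 are hypothetical names introduced for this example.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def serialize_u256(value: int) -> list[int]:
    """Hypothetical helper: split a u256 into [low, high] 128-bit parts."""
    if not 0 <= value < (1 << 256):
        raise ValueError("value does not fit in u256")
    return [value & ((1 << 128) - 1), value >> 128]


def serialize_array(
    members: Sequence[T],
    serialize_member: Callable[[T], list[int]],
) -> list[int]:
    """Emit <array_length> followed by every serialized member, in order."""
    calldata = [len(members)]
    for member in members:
        calldata.extend(serialize_member(member))
    return calldata


POW_2_128 = 0x100000000000000000000000000000000
calldata = serialize_array([10, 20, POW_2_128], serialize_u256)
```

Passing the member serializer as an argument keeps the array logic independent of the member type, so the same function can handle arrays of single-felt values or of nested types.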
Serialization of enums
An enum is serialized as follows:
<index_of_enum_variant>,<serialized_variant>
Note that enum variant indices are 0-based. Do not confuse them with the enum's storage layout, which is 1-based so that the first variant can be distinguished from an uninitialized storage slot.
Consider the following definition of an enum named Week:
enum Week {
    Sunday: (), // Index=0. The variant type is the unit type (0-tuple).
    Monday: u256, // Index=1. The variant type is u256.
}
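Now consider instantiations of the Week enum’s variants as shown in the table below:
Instance | Description | Serialization |
---|---|---|
Week::Sunday | Index=0. The variant type is the unit type. | [0] |
Week::Monday(5) | Index=1. The variant type is u256. | [1,5,0] |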
Consider the following definition of an enum named MessageType:
enum MessageType {
    A,
    #[default]
    B: u128,
    C
}
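Now consider instantiations of the MessageType enum’s variants as shown in the table below:
Instance | Description | Serialization |
---|---|---|
MessageType::A | Index=0. The variant type is the unit type. | [0] |
MessageType::B(6) | Index=1. The variant type is u128. | [1,6] |
MessageType::C | Index=2. The variant type is the unit type. | [2] |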
As you can see above, the #[default] attribute does not affect serialization. It only affects the storage layout of MessageType, where the default variant B will be stored as 0.
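The enum layout, a 0-based variant index followed by the serialized variant value, can be sketched off-chain as follows. The Python helpers serialize_enum and serialize_u256 are hypothetical, and the payload values 5 and 6 are chosen only for illustration.

```python
def serialize_u256(value: int) -> list[int]:
    """Hypothetical helper: split a u256 into [low, high] 128-bit parts."""
    if not 0 <= value < (1 << 256):
        raise ValueError("value does not fit in u256")
    return [value & ((1 << 128) - 1), value >> 128]


def serialize_enum(variant_index: int, serialized_variant: list[int]) -> list[int]:
    """Emit <index_of_enum_variant> followed by <serialized_variant>.

    A unit-type variant carries no data, so its serialized_variant
    is the empty list.
    """
    if variant_index < 0:
        raise ValueError("variant index must be non-negative")
    return [variant_index, *serialized_variant]


# Week: Sunday is index 0 (unit type), Monday is index 1 (u256).
week_sunday = serialize_enum(0, [])
week_monday = serialize_enum(1, serialize_u256(5))  # illustrative value 5

# MessageType: A=0, B=1 (u128, fits in one felt), C=2.
# The #[default] attribute on B plays no role here.
message_b = serialize_enum(1, [6])  # illustrative value 6
message_c = serialize_enum(2, [])
```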
Serialization of structs
You serialize a struct by serializing its members one at a time.
Its members are serialized in the order in which they appear in the definition of the struct.
For example, consider the following definition of the struct MyStruct:
struct MyStruct {
    a: u256,
    b: felt252,
    c: Array<felt252>
}
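The serialization is the same for both of the following instantiations of the struct’s members:
let struct1 = MyStruct { a: 2, b: 5, c: array![1, 2, 3] };
let struct2 = MyStruct { c: array![1, 2, 3], b: 5, a: 2 };
The serialization of MyStruct is determined as shown in the table Serialization for a struct in Cairo.
Member | Description | Serialization |
---|---|---|
a: 2 | For information on serializing u256 values, see Serialization of u256 values. | [2,0] |
b: 5 | One felt252 value. | 5 |
c: [1,2,3] | An array of three felt252 values. | [3,1,2,3] |
Combining the above, the struct is serialized as follows: [2,0,5,3,1,2,3]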
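To make the member-by-member rule concrete, the following Python sketch serializes a MyStruct off-chain. The function names are hypothetical, and the member values a = 2, b = 5, c = [1, 2, 3] are assumptions chosen for the example. Because a is a u256, its low part comes first, so it contributes 2,0 to the output.

```python
def serialize_u256(value: int) -> list[int]:
    """Hypothetical helper: split a u256 into [low, high] 128-bit parts."""
    if not 0 <= value < (1 << 256):
        raise ValueError("value does not fit in u256")
    return [value & ((1 << 128) - 1), value >> 128]


def serialize_my_struct(a: int, b: int, c: list[int]) -> list[int]:
    """Serialize MyStruct { a: u256, b: felt252, c: Array<felt252> }.

    Members are emitted in the order of the struct definition (a, b, c),
    regardless of the order in which the caller supplies them.
    """
    return [
        *serialize_u256(a),  # a: u256 -> low part, then high part
        b,                   # b: felt252 -> one felt
        len(c), *c,          # c: Array<felt252> -> length, then members
    ]


first = serialize_my_struct(a=2, b=5, c=[1, 2, 3])
second = serialize_my_struct(c=[1, 2, 3], b=5, a=2)  # same result
```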
Serialization of byte arrays
A string is represented in Cairo as a ByteArray type. A byte array is actually a struct with the following members:
- data: Array<felt252>
  Contains 31-byte chunks of the byte array. Each felt252 value has exactly 31 bytes. If the number of bytes in the byte array is less than 31, then this array is empty.
- pending_word: felt252
  The bytes that remain after filling the data array with full 31-byte chunks. The pending word consists of at most 30 bytes.
- pending_word_len: usize
  The number of bytes in pending_word.
Consider the string hello, whose ASCII encoding is the 5-byte hex value 0x68656c6c6f. The resulting byte array is serialized as follows:
0, // Number of 31-byte words in the data array.
0x68656c6c6f, // Pending word
5 // Length of the pending word, in bytes
Consider the string "Long string, more than 31 characters.", which is represented by the following hex values:
- 0x4c6f6e6720737472696e672c206d6f7265207468616e203331206368617261 (31-byte word)
- 0x63746572732e (6-byte pending word)
The resulting byte array is serialized as follows:
1, // Number of 31-byte words in the data array.
0x4c6f6e6720737472696e672c206d6f7265207468616e203331206368617261, // 31-byte word.
0x63746572732e, // Pending word
6 // Length of the pending word, in bytes
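The chunking described above can be sketched off-chain in Python. This is an illustration under the layout given in this section (data array with its length prefix, then pending_word, then pending_word_len); the function name serialize_byte_array is hypothetical.

```python
def serialize_byte_array(raw: bytes) -> list[int]:
    """Serialize bytes using the ByteArray layout described in the text.

    Hypothetical helper. Output: number of full 31-byte words, the words
    themselves, the pending word, and the pending word's length in bytes.
    """
    full_words = len(raw) // 31
    # Each full chunk of 31 bytes becomes one felt (big-endian integer).
    words = [
        int.from_bytes(raw[i * 31:(i + 1) * 31], "big")
        for i in range(full_words)
    ]
    # Whatever is left over (at most 30 bytes) is the pending word.
    pending = raw[full_words * 31:]
    # int.from_bytes(b"", "big") is 0, which covers an empty pending word.
    return [full_words, *words, int.from_bytes(pending, "big"), len(pending)]


short = serialize_byte_array("hello".encode("ascii"))
long = serialize_byte_array("Long string, more than 31 characters.".encode("ascii"))
```

The inputs are encoded as ASCII here to match the examples; how non-ASCII strings should be encoded before chunking is outside the scope of this sketch.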
Additional resources
- Integer types in The Cairo Programming Language.