2.2.3. Data Types¶
2.2.3.1. Address¶
- Type:
addr
- Example constants:
192.168.1.1, [2001:db8:85a3:8d3:1319:8a2e:370:7348], [::1]
- Addresses are passed by value.
The addr
type stores IP addresses. It handles IPv4 and IPv6
addresses transparently. Note that IPv6 constants need to be enclosed
in brackets. For a given addr
instance,
family()
retrieves the family.
Operators
-
address
==
address¶ Compares two address values, returning
True
if they are equal.
Methods
-
family(
)¶
Returns the IP family of an address value, as a spicy::AddrFamily.
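The address behaviour described above can be approximated outside Spicy. The following is a minimal Python sketch built on the standard ipaddress module. The helper name family and the labels "IPv4" and "IPv6" are illustrative assumptions, not the actual labels of spicy::AddrFamily.

```python
import ipaddress


def family(constant: str) -> str:
    """Return a label for the IP family of an address constant.

    Accepts the bracketed IPv6 notation used for address constants.
    The returned labels are hypothetical stand-ins for spicy::AddrFamily.
    """
    text = constant.strip()
    if text.startswith("[") and text.endswith("]"):
        text = text[1:-1]  # drop the brackets around an IPv6 constant
    addr = ipaddress.ip_address(text)
    return "IPv4" if addr.version == 4 else "IPv6"


print(family("192.168.1.1"))
print(family("[::1]"))
```

The sketch handles both families through one code path, mirroring how addr treats IPv4 and IPv6 transparently.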
2.2.3.2. Bool¶
- Type:
bool
- Example constants:
True, False
- Booleans are passed by value.
[TODO: Overview]
Operators
-
bool
==
bool¶ Compares two boolean values, returning
True
if they are equal.
-
bool
&&
bool¶ Returns the logical “and” of two booleans.
-
bool
||
bool¶ Returns the logical “or” of two booleans.
-
!
bool¶ Negates a boolean value.
Methods
None defined.
2.2.3.3. Bytes¶
[TODO: Overview]
Operators
-
bytes
==
bytes¶ Compares two
bytes
values.
-
bytes
+
bytes¶ Concatenates two
bytes
values.
-
bytes
+=
bytes¶ Appends a
bytes
value to another one.
-
|
bytes|
¶ Returns the length of the
bytes
instance.
Methods
-
begin(
)¶
Returns an iterator pointing to the initial element.
-
decode(
charset: enum {
)¶ Interprets the
bytes
as representing a binary string encoded with the given character set, and converts it into a UTF-8 string.
-
end(
)¶
Returns an iterator pointing one beyond the last element.
-
join(
l: list
)¶ Renders the elements of l into textual form and joins them into a single bytes object, using this bytes object as the separator.
-
lower(
)¶
Returns a lower-case version.
-
match(
r: regexp, n: [ int ]
)¶ Matches the
bytes
object against the regular expression r. Returns the matching part, or if n is given the corresponding subgroup within r.
-
split(
sep: [ bytes ]
)¶ Splits at each occurrence of
sep
, returning a vector of
bytes
representing each piece excluding the separators. If sep is omitted, the default is to split at any sequence of white-space.
-
split1(
sep: [ bytes ]
)¶ Splits at the first occurrence of
sep
, returning a pair of
bytes
representing everything before and afterwards, respectively. If sep is omitted, the default is to split at any sequence of white-space.
-
startswith(
b: bytes
)¶ Returns true if the
bytes
object begins with b.
-
strip(
side: [ enum { ], chars: [ bytes ]
)¶ Strips off leading and/or trailing characters, as indicated by side, defaulting to either side if side is not given. By default it strips off all whitespace; alternatively, it strips any characters contained in chars.
-
to_int(
base: [ uint<64> ]
)¶ Interprets the
bytes
as representing an ASCII-encoded number and converts it into a signed integer, using a base of base. If base is not given, the default is 10.
-
to_int(
byte_order: enum {
) Interprets the
bytes
as representing a binary number encoded with the given byte order, and converts it into a signed integer.
-
to_time(
base: [ uint<64> ]
)¶ Interprets the
bytes
as representing a number of seconds since the epoch in the form of an ASCII-encoded number and converts it into a time value, using a base of base. If base is not given, the default is 10.
-
to_time(
byte_order: enum {
) Interprets the
bytes
as representing a number of seconds since the epoch in the form of a binary number encoded with the given byte order, and converts it into a time value.
-
to_uint(
base: [ uint<64> ]
)¶ Interprets the
bytes
as representing an ASCII-encoded number and converts it into an unsigned integer, using a base of base. If base is not given, the default is 10.
-
to_uint(
byte_order: enum {
) Interprets the
bytes
as representing a binary number encoded with the given byte order, and converts it into an unsigned integer.
-
upper(
)¶
Returns an upper-case version.
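The splitting and integer-conversion methods above can be illustrated with a minimal Python sketch. The function names are hypothetical stand-ins for the bytes methods. Several details are assumptions the text above does not state: that empty pieces are kept, that split1 returns an empty second element when sep is not found, and that the binary form of to_int uses two's complement.

```python
from typing import List, Optional, Tuple


def split(data: bytes, sep: Optional[bytes] = None) -> List[bytes]:
    # Without a separator, Python splits at any run of whitespace.
    return data.split(sep)


def split1(data: bytes, sep: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    # Split at the first separator only; pad if nothing follows (assumption).
    parts = data.split(sep, 1)
    parts += [b""] * (2 - len(parts))
    return parts[0], parts[1]


def to_int_ascii(data: bytes, base: int = 10) -> int:
    # ASCII-encoded digits, e.g. b"-42" or b"ff" with base 16.
    return int(data.decode("ascii"), base)


def to_int_binary(data: bytes, byte_order: str) -> int:
    # byte_order is "big" or "little"; two's complement is assumed.
    return int.from_bytes(data, byteorder=byte_order, signed=True)


def to_uint_binary(data: bytes, byte_order: str) -> int:
    return int.from_bytes(data, byteorder=byte_order, signed=False)


print(split1(b"Host: example.org", b": "))
print(to_uint_binary(b"\x01\x00", "big"))
```

The two to_int variants differ only in how the input is read: one parses digits as text, the other reinterprets the raw bytes as a number in the given byte order.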
2.2.3.4. Double¶
[TODO: Overview]
Operators
-
cast<
int>(
double)
¶ Casts a double into an integer value, truncating any fractional value.
- double coerces to bool¶
Doubles coerce to boolean, returning true if the value is non-zero.
-
double
/
double¶ Divides two doubles.
-
double
==
double¶ Compares two doubles.
-
double
>
double¶ Returns whether the first double is larger than the second.
-
double
<
double¶ Returns whether the first double is smaller than the second.
-
double
-
double¶ Subtracts two doubles.
-
double
mod
double¶ Returns the remainder of dividing two doubles.
-
double
*
double¶ Multiplies two doubles.
-
double
+
double¶ Adds two doubles.
-
double
**
double¶ Raises a double to a given power.
Methods
None defined.
2.2.3.5. Enum¶
[TODO: Overview]
Operators
- t:type(any)¶
Converts an integer into an enum.
-
cast<
int>(
enum{)
¶ Casts an enum into an integer, returning a value that is consistent and unique among all labels of the enum’s type.
- enum{ coerces to bool¶
Enums coerce to boolean, returning true if the value corresponds to a known label.
-
enum{
==
enum{¶ Compares two enum values.
Methods
None defined.
2.2.3.6. Function¶
[TODO: Overview]
Operators
- t:function():void(any)¶
Calls a function.
Methods
None defined.
2.2.3.7. Integer¶
[TODO: Overview]
Operators
-
int
&
int¶ Computes the bitwise and of two integers.
-
int
|
int¶ Computes the bitwise or of two integers.
-
int
^
int¶ Computes the bitwise xor of two integers.
-
cast<
int>(
int)
¶ Casts an integer into a different integer type, extending/truncating as needed.
-
cast<
interval>(
int)
Casts an unsigned integer into an interval, interpreting the value as seconds.
-
cast<
time>(
int)
Casts an unsigned integer into a time, interpreting the value as seconds since the epoch.
- int coerces to bool¶
Integers coerce to boolean, returning true if the value is non-zero.
- int coerces to double
Unsigned integers coerce to doubles.
- int coerces to int
Integers coerce to other integer types if their signedness matches and the target type’s width is larger or equal.
- int coerces to interval
Unsigned integers coerce to intervals.
-
int
/
int¶ Divides two integers.
-
int
==
int¶ Compares two integers for equality.
-
int
>
int¶ Returns whether the first integer is larger than the second.
-
int
<
int¶ Returns whether the first integer is smaller than the second.
-
int
-
int¶ Subtracts two integers.
-
int
-=
int¶ Decreases an integer by a given amount.
-
int
mod
int¶ Returns the remainder of dividing two integers.
-
int
*
int¶ Multiplies two integers.
-
int
+
int¶ Adds two integers.
-
int
+=
int¶ Increases an integer by a given amount.
-
int
**
int¶ Raises an integer to a given power.
-
int
<<
int¶ Shifts an integer left by a given number of bits.
-
int
>>
int¶ Shifts an integer right by a given number of bits.
Methods
None defined.
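The integer cast and coercion rules above can be modelled with a short Python sketch. The function names are hypothetical. The two's-complement wrap-around on truncation is an assumption, since the text only says that values are extended or truncated as needed.

```python
def cast_int(value: int, width: int, signed: bool) -> int:
    """Reinterpret value as an integer of the given bit width."""
    mask = (1 << width) - 1
    truncated = value & mask  # keep only the low `width` bits
    if signed and truncated >= 1 << (width - 1):
        truncated -= 1 << width  # treat the top bit as the sign bit
    return truncated


def coerces(src_width: int, src_signed: bool,
            dst_width: int, dst_signed: bool) -> bool:
    """Implicit coercion: same signedness and a target at least as wide."""
    return src_signed == dst_signed and dst_width >= src_width


print(cast_int(300, 8, signed=False))
print(coerces(8, False, 16, False))
```

A cast is explicit and may lose information, whereas a coercion is only permitted when no value can be lost, which is why it requires matching signedness and a sufficient width.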
2.2.3.8. Interval¶
[TODO: Overview]
Operators
-
cast<
double>(
interval)
¶ Casts an interval into a double.
-
cast<
int>(
interval)
Casts an interval into an integer value, truncating any fractional value.
- interval coerces to bool¶
Intervals coerce to boolean, returning true if the value is non-zero.
-
interval
==
interval¶ Compares two intervals.
-
interval
>
interval¶ Returns whether the first interval is larger than the second.
-
interval
<
interval¶ Returns whether the first interval is smaller than the second.
-
interval
-
interval¶ Subtracts two intervals.
-
interval
*
int¶ Multiplies an interval with an integer.
-
int
*
interval Multiplies an integer with an interval.
-
interval
+
interval¶ Adds two intervals.
Methods
-
nsecs(
)¶
Returns the interval as nanoseconds.
2.2.3.9. Iterator¶
[TODO: Overview]
Operators
-
*
iterator¶ Returns the element referenced by the iterator.
-
iterator
==
iterator¶ Compares two iterators.
-
iterator
++
¶ Advances the iterator by one element, returning the previous iterator.
-
++
iterator¶ Advances the iterator by one element, returning the new iterator.
-
iterator
+
int¶ Returns an iterator advanced by a given number of elements.
-
iterator
+=
int¶ Advances the iterator by a given number of elements.
Methods
None defined.
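The two single-step advance operators above differ only in what they return: one yields the iterator as it was before advancing, the other yields the advanced iterator. A minimal Python sketch makes the difference concrete. The class Iter and its method names are hypothetical and do not reflect Spicy's implementation.

```python
class Iter:
    """Toy iterator over a bytes object."""

    def __init__(self, data: bytes, pos: int = 0) -> None:
        self.data = data
        self.pos = pos

    def deref(self) -> int:
        # Element referenced by the iterator.
        return self.data[self.pos]

    def post_incr(self) -> "Iter":
        # Advance by one element, returning the previous iterator.
        previous = Iter(self.data, self.pos)
        self.pos += 1
        return previous

    def pre_incr(self) -> "Iter":
        # Advance by one element, returning the new iterator.
        self.pos += 1
        return self

    def __add__(self, n: int) -> "Iter":
        # A new iterator advanced by n elements; the original is unchanged.
        return Iter(self.data, self.pos + n)

    def __eq__(self, other: object) -> bool:
        return (isinstance(other, Iter)
                and self.data is other.data and self.pos == other.pos)


data = b"abc"
it = Iter(data)
first = it.post_incr()
print(chr(first.deref()), chr(it.deref()))
```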
2.2.3.10. List¶
[TODO: Overview]
Operators
-
list
+=
list¶ Appends a list value to another one.
-
|
list|
¶ Returns the length of the list.
Methods
-
push_back(
elem: any
)¶ Appends an element to the list.
2.2.3.11. Map¶
[TODO: Overview]
Operators
-
delete
map<*,*>[
any]
¶ Deletes an element from the map. If the element does not exist, there’s no effect.
-
any
in
map<*,*>¶ Returns true if there’s a map element with the given index.
-
map<*,*>
[
any]
¶ Returns the map element at the given index.
-
map<*,*>
[
any]
=
any¶ Assigns an element to the given index of the map. Any already existing element will be overwritten.
-
|
map<*,*>|
¶ Returns the number of elements in the map.
Methods
-
clear(
)¶
Removes all elements from the map.
-
get(
index: any, default: [ any ]
)¶ Returns the map element at the given index. If the element does not exist, default is returned if given.
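The map operations above correspond closely to those of a hash map. The following Python sketch uses dict as a stand-in, with hypothetical helper names. What get() does when the element is missing and no default is given is not stated above, so the KeyError raised here is purely an assumption.

```python
from typing import Any, Dict

_MISSING = object()  # sentinel meaning "no default was passed"


def map_delete(m: Dict[Any, Any], index: Any) -> None:
    # Deleting a non-existent element has no effect.
    m.pop(index, None)


def map_get(m: Dict[Any, Any], index: Any, default: Any = _MISSING) -> Any:
    if index in m:
        return m[index]
    if default is not _MISSING:
        return default
    raise KeyError(index)  # assumption: the source does not specify this case


ports = {"http": 80}
ports["https"] = 443          # assignment overwrites any existing element
map_delete(ports, "gopher")   # no effect, no error
print(map_get(ports, "ssh", 22), len(ports))
```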
2.2.3.12. Set¶
[TODO: Overview]
Operators
-
add
set[
any]
¶ Adds an element to the set.
-
delete
set[
any]
¶ Deletes an element from the set. If the element does not exist, there’s no effect.
-
any
in
set¶ Returns true if the element is a member of the set.
-
|
set|
¶ Returns the number of elements in the set.
Methods
-
clear(
)¶
Removes all elements from the set.
2.2.3.13. Sink¶
[TODO: Overview]
Operators
-
new
sink¶ Instantiates a new sink.
-
|
sink|
¶ Returns the number of bytes written into the sink so far. If the sink has filters attached, this returns the value after filtering.
Methods
-
add_filter(
t: enum {
)¶ Adds an input filter as specified by t (of type Spicy::Filter) to the sink. The filter will receive all input written into the sink first, transform it according to its semantics, and then the parser attached to the unit will parse the output of the filter. Multiple filters can be added to a sink, in which case they will be chained into a pipeline and the data is passed through them in the order they have been added. The parsing will then be carried out on the output of the last filter in the chain. Note that filters must be added before the first data chunk is written into the sink. If data has already been written when a filter is added, behaviour is undefined. Currently, only a set of predefined filters can be used; see Spicy::Filter. One cannot define one’s own filters in Spicy. Todo: We should probably either enable adding filters later, or catch the case of adding them too late at run-time and abort with an exception.
-
close(
)¶
Closes a sink by disconnecting all parsing units. Afterwards, the sink’s state is as if it had just been created (so new units can be connected). Note that a sink is automatically closed when the unit it is part of is done parsing. Also note that a previously connected parsing unit cannot be reconnected; trying to do so will still throw a UnitAlreadyConnected exception.
-
connect(
u: unit
)¶ Connects a parsing unit to a sink. All subsequent write() calls will pass their data to this parsing unit. Each unit can only be connected to a single sink. If the unit is already connected, a UnitAlreadyConnected exception is thrown. However, a sink can have more than one unit connected.
-
connect_mime_type(
b: bytes
)¶ Connects a parsing unit to a sink. All subsequent write() calls will pass their data to this parsing unit. Each unit can only be connected to a single sink. If the unit is already connected, a UnitAlreadyConnected exception is thrown. However, a sink can have more than one unit connected.
-
connect_mime_type(
b: string
) Connects a parsing unit to a sink. All subsequent write() calls will pass their data to this parsing unit. Each unit can only be connected to a single sink. If the unit is already connected, a UnitAlreadyConnected exception is thrown. However, a sink can have more than one unit connected.
-
gap(
seq: uint<64>, len: uint<64>
)¶ Reports a gap in the input stream. seq is the sequence number of the first byte missing, len is the length of the gap. seq is relative to the sink’s initial sequence number, which defaults to zero.
-
sequence(
)¶
Returns the current sequence number of the sink’s input stream, which is one beyond all data that has been put in order and delivered so far. The returned value is relative to the sink’s initial sequence number, which defaults to zero.
-
set_auto_trim(
enabled: bool
)¶ Enables or disables auto-trimming. If enabled (which is the default), sink input data is trimmed automatically once it is in order and processed. See trim() for more information about trimming. TODO: Disabling auto-trimming is not yet supported.
-
set_initial_sequence_number(
seq: uint<64>
)¶ Sets the sink’s initial sequence number. All sequence numbers given to other methods are interpreted relative to this one. By default, a sink’s initial sequence number is zero.
-
set_policy(
policy: enum {
)¶ Sets a sink’s reassembly policy for ambiguous input. As long as data hasn’t been trimmed, a sink detects overlapping chunks. The policy decides how to handle ambiguous overlaps. The default policy is Spicy::ReassemblyPolicy::First, which resolves ambiguities by taking the data from the chunk that came first. TODO: First is currently the only policy supported.
-
skip(
seq: uint<64>
)¶ Skips ahead in the input stream. seq is the sequence number where to continue parsing, relative to the sink’s initial sequence number. If there’s still data buffered before that position, that will be ignored and, if auto-skip is on, also immediately deleted. If new data is passed in later before seq, that will likewise be ignored. If the input stream is currently stuck inside a gap, and seq is beyond that gap, the stream will resume processing at seq.
-
trim(
seq: uint<64>
)¶ Deletes all data that’s still buffered internally up to seq. seq is relative to the sink’s initial sequence number, which defaults to zero. If processing the input stream hasn’t reached seq yet, it will also skip ahead to there. Trimming the input stream releases the memory, but means that the sink won’t be able to detect any further data mismatches. Note that by default, auto-trimming is enabled, which means all data is trimmed automatically once it is in order and processed.
-
try_connect_mime_type(
b: bytes
)¶ Connects a parsing unit to a sink. All subsequent write() calls will pass their data to this parsing unit. Each unit can only be connected to a single sink. If the unit is already connected, a UnitAlreadyConnected exception is thrown. However, a sink can have more than one unit connected.
-
try_connect_mime_type(
b: string
) Connects a parsing unit to a sink. All subsequent write() calls will pass their data to this parsing unit. Each unit can only be connected to a single sink. If the unit is already connected, a UnitAlreadyConnected exception is thrown. However, a sink can have more than one unit connected.
-
write(
b: bytes, seq: [ uint<64> ], len: [ uint<64> ]
)¶ Passes data on to all connected parsing units. Multiple write() calls act like passing in incremental input; the units parse the chunks as if they were a single stream of data. If data is passed in out of order, it will be reassembled before being passed on, according to the sequence number seq provided; seq is interpreted relative to the initial sequence number set with set_initial_sequence_number, or 0 if not otherwise set. If no sequence number is provided, the data is assumed to represent a chunk to be appended to the current end of the input stream. If len is provided, the data is assumed to represent that many bytes inside the sequence space; if not provided, len defaults to the length of b. If no units are connected, the call does not have any effect. If one parsing unit throws an exception, parsing of subsequent units does not proceed. Note that the order in which the data is passed to the connected units is undefined. Todo: The exception semantics are quite fuzzy. What’s the right strategy here?
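The reassembly behaviour that the sink methods describe (out-of-order write() calls, the First policy for overlaps, skip(), and sequence()) can be illustrated with a toy model. The following Python sketch is not Spicy's implementation. The class name SinkModel, its byte-granular buffer, and the use of plain callables as connected units are all assumptions made for illustration. It also assumes that seq arguments are absolute values from which the initial sequence number is subtracted, and it leaves out the len parameter, gap(), and filters.

```python
from typing import Callable, Dict, List, Optional


class SinkModel:
    """Toy model of a sink's in-order delivery."""

    def __init__(self, initial_seq: int = 0) -> None:
        self.initial_seq = initial_seq
        self.units: List[Callable[[bytes], None]] = []
        self.pending: Dict[int, int] = {}  # relative position -> byte value
        self.cur = 0  # next relative position to deliver
        self.end = 0  # one beyond the highest relative position seen

    def connect(self, unit: Callable[[bytes], None]) -> None:
        self.units.append(unit)

    def write(self, data: bytes, seq: Optional[int] = None) -> None:
        # Without seq, the chunk is appended to the current end of the stream.
        rel = self.end if seq is None else seq - self.initial_seq
        for i, byte in enumerate(data):
            pos = rel + i
            if pos >= self.cur:
                # "First" policy: the byte that arrived first wins.
                self.pending.setdefault(pos, byte)
        self.end = max(self.end, rel + len(data))
        self._deliver()

    def skip(self, seq: int) -> None:
        rel = seq - self.initial_seq
        if rel > self.cur:
            # Drop everything buffered before the new position.
            for pos in [p for p in self.pending if p < rel]:
                del self.pending[pos]
            self.cur = rel
            self.end = max(self.end, rel)
        self._deliver()

    def sequence(self) -> int:
        return self.cur

    def _deliver(self) -> None:
        out = bytearray()
        while self.cur in self.pending:
            out.append(self.pending.pop(self.cur))  # pop models auto-trimming
            self.cur += 1
        if out:
            for unit in self.units:
                unit(bytes(out))


received: List[bytes] = []
sink = SinkModel()
sink.connect(received.append)
sink.write(b"world", seq=5)  # out of order: buffered until the hole is filled
sink.write(b"hello", seq=0)  # fills the hole, so both chunks are delivered
print(received, sink.sequence())
```

Buffering per byte keeps the overlap handling trivial at the cost of efficiency; a real reassembler would more likely track chunks or ranges.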
2.2.3.14. Time¶
[TODO: Overview]
Operators
-
cast<
double>(
time)
¶ Casts a time into a double.
-
cast<
int>(
time)
Casts a time into an integer value, truncating any fractional value.
- time coerces to bool¶
Times coerce to boolean, returning true if the value is non-zero.
-
time
==
time¶ Compares two times.
-
time
>
time¶ Returns whether the first time is larger than the second.
-
time
<
time¶ Returns whether the first time is smaller than the second.
-
time
-
time¶ Subtracts two times.
-
time
-
interval Subtracts an interval from a time.
-
time
+
interval¶ Adds an interval to a time.
-
interval
+
time Adds an interval to a time.
Methods
-
nsecs(
)¶
Returns the time as nanoseconds.
2.2.3.15. Tuple¶
[TODO: Overview]
Operators
- tuple coerces to tuple¶
Tuples coerce to other tuples if all their elements coerce individually.
-
tuple
==
tuple¶ Compares two tuples for equality.
-
tuple
[
int]
¶ Returns the tuple element at a given index.
Methods
None defined.
2.2.3.16. Unit¶
[TODO: Overview]
Operators
-
unit
.
<attr>¶ Accesses a unit field.
-
unit
.
<attr>=any¶ Assigns a value to a unit field.
- unit?.<attr>¶
Returns true if a unit field is set.
-
new
type¶ Instantiates a new parse object for a given unit type.
- unit.?<attr>¶
Returns the value of a unit field if it’s set; otherwise throws a Spicy::AttributeNotSet exception.
Methods
-
add_filter(
f: enum {
)¶ Adds an input filter of type Spicy::Filter to the unit object. The filter will receive all parsed input first, transform it according to its semantics, and then the unit will parse the output of the filter. Multiple filters can be added to a parsing unit, in which case they will be chained into a pipeline and the data is passed through them in the order they have been added. The actual unit parsing will then be carried out on the output of the last filter in the chain. Note that filters must be added before the first data chunk is passed in. If parsing has already started when a filter is added, behaviour is undefined. Also note that filters can only be added to exported unit types. Currently, only a set of predefined filters can be used; see Spicy::Filter. One cannot define one’s own filters in Spicy (but one can achieve a similar effect with sinks). Todo: We should probably either enable adding filters later, or catch the case of adding them too late at run-time and abort with an exception.
-
backtrack(
)¶
Aborts parsing at the current position and returns to the most recent
&try
attribute. Turns into a parse error if there’s no &try
.
-
confirm(
)¶
Aborts parsing at the current position and returns to the most recent
&try
attribute. Turns into a parse error if there’s no &try
.
-
disable(
msg: string
)¶ Aborts parsing at the current position and returns to the most recent
&try
attribute. Turns into a parse error if there’s no &try
.
-
disconnect(
)¶
Disconnect the unit from its parent sink. The unit gets signaled a regular end of data, so if it still has input pending, that might be processed before the method returns. If the unit is not connected to a sink, the method does not have any effect.
-
input(
)¶
Returns an
iter<bytes>
referencing the first byte of the raw data for parsing the unit. This method must only be called while the unit is being parsed, and will throw an
UndefinedValue
exception otherwise. Note that using this method requires the unit being parsed to fully buffer its input until finished. That may have a performance impact, in particular in terms of memory requirements, since the garbage collector may now need to hold on to the input significantly longer.
-
mime_type(
)¶
Returns the MIME type that was specified when the unit was instantiated (e.g., via sink.connect_mime_type()). Returns an empty
bytes
object if none was specified. This method can only be called for exported types.
-
offset(
)¶
Returns the uint<64> offset of the current parsing position relative to the start of the current parsing unit. This method must only be called while the unit is being parsed, and will throw an
UndefinedValue
exception otherwise. Note that inside a field hook, the current parsing position will have already moved on to the start of the next field because the hook is only run after the current field has been fully parsed. On the other hand, if the method is called from an expression evaluated before the parsing of a field starts (such as in a field’s &length
attribute), the returned offset will reflect the beginning of that field. Note that using this method requires the unit being parsed to fully buffer its input until finished. That may have a performance impact, in particular in terms of memory requirements, since the garbage collector may now need to hold on to the input significantly longer.
-
set_position(
b: iterator<bytes>
)¶ Changes the position in the input stream to continue parsing from. The new position is a new
iter<bytes>
where subsequent parsing will proceed. Note this changes the position globally: all subsequent fields will be parsed from the new position, including those of a potential higher-level unit this unit is part of. Returns an iter<bytes>
with the old position. Note that using this method requires the unit being parsed to fully buffer its input until finished. That may have a performance impact, in particular in terms of memory requirements, since the garbage collector may now need to hold on to the input significantly longer.
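The filter chaining described for add_filter() (filters run in the order they were added, and parsing is carried out on the output of the last one) can be sketched in Python. Everything here is a hypothetical stand-in: the class UnitModel, the feed() method, and the use of base64 and zlib decoders as filters are illustrative only and are not claimed to be members of Spicy::Filter. The text above leaves late additions undefined; raising an exception follows the suggestion in the Todo and is an assumption. The sketch also processes a single complete chunk rather than a stream.

```python
import base64
import zlib
from typing import Any, Callable, List


class UnitModel:
    """Toy model of a unit whose input passes through a filter pipeline."""

    def __init__(self, parse: Callable[[bytes], Any]) -> None:
        self.parse = parse
        self.filters: List[Callable[[bytes], bytes]] = []
        self.started = False

    def add_filter(self, f: Callable[[bytes], bytes]) -> None:
        if self.started:
            # Assumption: reject late additions instead of undefined behaviour.
            raise RuntimeError("filters must be added before the first data")
        self.filters.append(f)

    def feed(self, data: bytes) -> Any:
        self.started = True
        for f in self.filters:  # apply filters in the order they were added
            data = f(data)
        return self.parse(data)  # parse the output of the last filter


payload = base64.b64encode(zlib.compress(b"GET / HTTP/1.1"))
unit = UnitModel(parse=lambda d: d.split())
unit.add_filter(base64.b64decode)  # first undo the outer encoding
unit.add_filter(zlib.decompress)   # then the inner compression
print(unit.feed(payload))
```

The order of the add_filter() calls matters: swapping the two filters would hand base64 text to the decompressor and fail.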
2.2.3.17. Vector¶
[TODO: Overview]
Operators
-
vector
[
int]
¶ Returns the vector element at a given index.
-
vector
[
int]
=
any¶ Assigns an element to the given index of the vector.
-
|
vector|
¶ Returns the length of the vector.
Methods
-
push_back(
elem: any
)¶ Appends an element to the vector.
-
reserve(
c: int
)¶ Resizes the vector to reserve a capacity of at least c. This shrinks the vector if c is smaller than the current size.