asdf/file.h

This is the high-level public API for working with ASDF files. It includes functions for opening and closing ASDF file handles, represented by asdf_file_t pointers.

Most of these functions take an open asdf_file_t* as their first argument, and retrieve scalar values and more complex objects out of the ASDF tree.

type asdf_file_t

An opaque struct representing an open ASDF file handle

Pointers to asdf_file_t are the primary interface to each open ASDF file and can be created and allocated with asdf_open, asdf_open_file, asdf_open_fp, or asdf_open_mem.

Configuration

enum asdf_block_decomp_mode_t

Options for decompression mode, for use with asdf_config_t

Todo

Document modes

Todo

When lazy decompression is implemented there will likely be multiple implementations (userfaultfd, sigsegv, etc.). Add options to specify exactly which implementation to use, where ASDF_BLOCK_DECOMP_MODE_LAZY by itself will choose the most appropriate one (generally userfaultfd if available)

enumerator ASDF_BLOCK_DECOMP_MODE_AUTO = 0

Automatically select the best mode

enumerator ASDF_BLOCK_DECOMP_MODE_EAGER

Force eager decompression

enumerator ASDF_BLOCK_DECOMP_MODE_LAZY

Force lazy decompression if possible

Lazy decompression is currently only implemented on recent-enough Linux versions (4.11+) that support the userfaultfd system call. If this option is passed on a system where it is not supported it will fall back to eager decompression.

struct asdf_config_t

Struct containing extended options to use when opening and reading files

For use with asdf_open_ex and relatives.

asdf_parser_cfg_t parser

Low-level parser configuration; see asdf_parser_cfg_t

asdf_emitter_cfg_t emitter

Low-level emitter configuration; see asdf_emitter_cfg_t

struct [anonymous]

Decompression options

asdf_block_decomp_mode_t mode

Decompression mode (see asdf_block_decomp_mode_t)

size_t max_memory_bytes

Max size in bytes of the decompressed data, above which decompression to disk will be used (see Compression)

double max_memory_threshold

Max fraction (from 0.0 to 1.0) of total system memory above which decompression to disk will be used (see Compression)

size_t chunk_size

Size in bytes of chunks to decompress at a time when using lazy decompression

Defaults to one page, and is always rounded up to the nearest page size.

const char *tmp_dir

Optional temporary directory path to use when decompressing to disk
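
The rounding described for chunk_size can be sketched as follows. This is a minimal illustration, not the library's implementation: round_up_to_page and system_page_size are hypothetical helper names, and both the treatment of 0 as "one page" and the 4096-byte fallback are assumptions of the sketch.

```c
#include <stddef.h>
#include <unistd.h>

/* Round n up to the next multiple of page (page must be non-zero).
 * A request of 0 is treated here as "use one page"; that default is
 * an assumption of this sketch. */
static size_t round_up_to_page(size_t n, size_t page) {
    if (n == 0)
        return page;
    size_t rem = n % page;
    return rem == 0 ? n : n + (page - rem);
}

/* Query the system page size via POSIX sysconf(). */
static size_t system_page_size(void) {
    long page = sysconf(_SC_PAGESIZE);
    return page > 0 ? (size_t)page : 4096; /* fallback value is an assumption */
}
```

With a 4096-byte page, a requested chunk size of 4097 would become 8192 under this rule, while 4096 stays unchanged.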

File openers

asdf_open(...)

Opens an ASDF file for reading

This is a convenience macro for asdf_open_file, asdf_open_fp, or asdf_open_mem depending on the argument types
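
Dispatch on argument type of this kind is commonly written with C11 _Generic. The sketch below shows the general technique only; whether asdf_open is implemented this way is an assumption, and the demo_* names are hypothetical stand-ins that return a tag instead of opening anything.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the three openers; each returns a tag
 * identifying which one was selected. */
static int demo_open_file(const char *filename, const char *mode) {
    (void)filename; (void)mode;
    return 1;
}

static int demo_open_fp(FILE *fp, const char *filename) {
    (void)fp; (void)filename;
    return 2;
}

static int demo_open_mem(const void *buf, size_t size) {
    (void)buf; (void)size;
    return 3;
}

/* _Generic selects a function from the type of the first argument;
 * only the selected function is then called with both arguments. */
#define demo_open(src, arg)               \
    _Generic((src),                       \
        char *: demo_open_file,           \
        const char *: demo_open_file,     \
        FILE *: demo_open_fp,             \
        void *: demo_open_mem,            \
        const void *: demo_open_mem)((src), (arg))
```

In this sketch a buffer must be passed as a void * or const void * for the memory variant to be selected; any other pointer type is a compile-time error.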

asdf_open_ex(source, ...)

Opens an ASDF file for reading with extended options

Extended version of asdf_open taking an optional pointer to asdf_config_t configuration options as the last argument, or NULL to use the default options (equivalent to asdf_open).

When passing in an asdf_config_t*, the config struct is copied:

  • This allows passing in the options from a local variable

  • Prevents modifications of the options while the file is open

  • In many cases you can leave options set to zero, and they will be filled in with defaults.

This is a convenience macro for asdf_open_file_ex, asdf_open_fp_ex, or asdf_open_mem_ex depending on the argument types

static inline asdf_file_t *asdf_open_file(const char *filename, const char *mode)

Opens an ASDF file for reading

Equivalent to asdf_open.

static inline asdf_file_t *asdf_open_fp(FILE *fp, const char *filename)

Opens an ASDF file from an already open FILE*

This assumes the file is open for reading.

Parameters:
  • fp – An open FILE*

  • filename – An optional filename for the open file. This need not be a real filesystem path, and can be any display name for the file; used mainly in error messages.

Returns:

An asdf_file_t*

static inline asdf_file_t *asdf_open_mem(const void *buf, size_t size)

Opens an ASDF file from a memory buffer

Parameters:
  • buf – An arbitrary block of memory from a void*

  • size – The size of the memory buffer

Returns:

An asdf_file_t*

asdf_write_to(file, ...)

Write the contents of an asdf_file_t to a destination

This is a type-generic macro that dispatches to one of the following based on the type and number of arguments after file:

  • asdf_write_to(file, filename) – where filename is a const char * or char *: calls asdf_write_to_file

  • asdf_write_to(file, fp) – where fp is a FILE *: calls asdf_write_to_fp

  • asdf_write_to(file, buf, size) – where buf is a void ** and size is a size_t *: calls asdf_write_to_mem

Parameters:
  • file – The asdf_file_t* to write

  • ... – Destination argument(s) – see above

Returns:

0 on success, non-zero on failure

int asdf_write_to_file(asdf_file_t *file, const char *filename)

Write the contents of the asdf_file_t to the given filesystem path

Parameters:
  • file – The asdf_file_t* to write

  • filename – Path to the output file; created or truncated as needed

Returns:

0 on success, non-zero on failure

int asdf_write_to_fp(asdf_file_t *file, FILE *fp)

Write the contents of the asdf_file_t to the given writeable FILE * stream

Parameters:
  • file – The asdf_file_t* to write

  • fp – An open, writeable FILE * stream

Returns:

0 on success, non-zero on failure

int asdf_write_to_mem(asdf_file_t *file, void **buf, size_t *size)

Write the contents of the asdf_file_t to a memory buffer

If *buf is non-NULL, a user-provided buffer is assumed and its size is read from *size. If the buffer is not large enough to hold the file, the output is truncated and a non-zero value is returned.

If *buf is NULL, a buffer is allocated with malloc() and a pointer to it is stored in *buf; the allocated size is written to *size. The caller is responsible for freeing the buffer with free().

Parameters:
  • file – The asdf_file_t* to write

  • buf – Address of a void * buffer pointer (in/out)

  • size – Address of a size_t holding the buffer size (in/out)

Returns:

0 on success, non-zero on failure
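
The in/out contract for buf and size can be illustrated with a stand-in that copies raw bytes instead of serializing a file. demo_write_to_mem is a hypothetical function written for this sketch, not part of libasdf; it only mirrors the behavior described above. Leaving *size untouched when the caller supplies the buffer is an assumption.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in following the contract described for
 * asdf_write_to_mem: copies n bytes from src.
 * Returns 0 on success, non-zero on failure. */
static int demo_write_to_mem(const void *src, size_t n, void **buf, size_t *size) {
    if (buf == NULL || size == NULL)
        return -1;

    if (*buf != NULL) {
        /* Caller-provided buffer: *size holds its capacity. */
        size_t capacity = *size;
        size_t ncopy = n < capacity ? n : capacity;
        memcpy(*buf, src, ncopy);
        /* Truncated output is reported as a failure. */
        return n > capacity ? 1 : 0;
    }

    /* No buffer provided: allocate one; the caller must free() it. */
    void *out = malloc(n > 0 ? n : 1);
    if (out == NULL)
        return -1;
    memcpy(out, src, n);
    *buf = out;
    *size = n;
    return 0;
}
```

A caller that wants the library-allocated variant initializes its pointer to NULL before the call and releases the result with free() afterwards.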

void asdf_close(asdf_file_t *file)

Closes an open asdf_file_t*, freeing associated resources where possible

Any other resources associated with that file handle, such as ndarrays, should no longer be expected to work and should ideally be freed before closing the file.

Parameters:
  • file – The asdf_file_t* to close

asdf_file_t *asdf_open_file_ex(const char *filename, const char *mode, asdf_config_t *config)

Opens an ASDF file for reading, with optional extended options

Extended version of asdf_open_file taking an optional pointer to asdf_config_t configuration options, or NULL to use the default options (equivalent to asdf_open_file).

When passing in an asdf_config_t*, the config struct is copied:

  • This allows passing in the options from a local variable

  • Prevents modifications of the options while the file is open

  • In many cases you can leave options set to zero, and they will be filled in with defaults.

asdf_open_ex dispatches to this function when its source argument is a filename.

Parameters:
  • filename – A null-terminated string containing the local filesystem path to open

  • mode – Currently must always be just "r". This will support other opening modes in the future (e.g. for writes, updates).

  • config – A pointer to an asdf_config_t (may be partially initialized)

Returns:

An asdf_file_t*

asdf_file_t *asdf_open_fp_ex(FILE *fp, const char *filename, asdf_config_t *config)

Opens an ASDF file from an already open FILE*, with optional extended options

This assumes the file is open for reading.

Parameters:
  • fp – An open FILE*

  • filename – An optional filename for the open file. This need not be a real filesystem path, and can be any display name for the file; used mainly in error messages.

  • config – A pointer to an asdf_config_t (may be partially initialized)

Returns:

An asdf_file_t*

asdf_file_t *asdf_open_mem_ex(const void *buf, size_t size, asdf_config_t *config)

Opens an ASDF file from a memory buffer, with optional extended options

Parameters:
  • buf – An arbitrary block of memory from a void*

  • size – The size of the memory buffer

  • config – A pointer to an asdf_config_t (may be partially initialized)

Returns:

An asdf_file_t*

Error handling

const char *asdf_error(asdf_file_t *file)

Retrieve an error on a file

This is typically used to check for errors on the file itself, such as parse errors, and not for user data errors (such as invalid type conversions on an asdf_value_t).

If passed NULL, returns the global error, if one is set (typically from an error opening a file or from library initialization).

See the section on Error handling for more details.

Parameters:
  • file – The asdf_file_t* to check, or NULL to check the global error

Returns:

NULL if there is no error set, otherwise a pointer to the error message string

asdf_error_code_t asdf_error_code(asdf_file_t *file)

Retrieve the error code set on a file

Returns ASDF_ERR_NONE if no error is set, or if file is NULL and there is no global error.

Parameters:
  • file – The asdf_file_t* to check, or NULL to check the global error

Returns:

The asdf_error_code_t for the current error

int asdf_error_errno(asdf_file_t *file)

Retrieve the saved OS errno from the last ASDF_ERR_SYSTEM error

Only meaningful when asdf_error_code returns ASDF_ERR_SYSTEM. Returns 0 when there is no system error or file is NULL and there is no global error.

Parameters:
  • file – The asdf_file_t* to check, or NULL to check the global error

Returns:

The saved errno value

Reading values

The following functions are the high-level interface for retrieving typed values out of the ASDF metadata tree. These include plain scalar values, mappings, and sequences, as well as tagged data structures that have a registered extension for handling them (this includes objects belonging to the ASDF core schema, such as core/history_entry or core/ndarray). The getters for schema-specific objects are not documented here, but follow the same patterns.

For each type that can be read out of the ASDF tree there is an asdf_is_<type> function, which just checks the type and returns a bool, and an asdf_get_<type> function. Each takes the asdf_file_t* as its first argument, then a YAML Pointer expression for the path to the value within the tree. The asdf_get_<type> functions additionally take a pointer through which the value is returned by reference; their return value is always an asdf_value_err_t.

If the value exists and successfully converts to the requested type the return value is ASDF_VALUE_OK. There are other return values such as ASDF_VALUE_ERR_NOT_FOUND (the path simply does not exist) or ASDF_VALUE_ERR_TYPE_MISMATCH (a value exists at that path but is the wrong type). A few other more obscure errors can occur; see asdf_value_err_t.

The one exception to the above is asdf_get_value which simply returns the generic asdf_value_t* if the path exists, or NULL otherwise. See Reading values out of the ASDF tree for more details on generic values.
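
The calling convention shared by the getters (status code as the return value, result through an output pointer) can be sketched with a toy model. Everything named demo_* or DEMO_* below is hypothetical and exists only for illustration; the real status codes are the ASDF_VALUE_* members of asdf_value_err_t, and the real getters also take the asdf_file_t* as their first argument.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature of the status codes. */
typedef enum {
    DEMO_OK = 0,
    DEMO_ERR_NOT_FOUND,
    DEMO_ERR_TYPE_MISMATCH
} demo_err_t;

/* A toy "tree" of two entries standing in for a parsed file. */
typedef struct {
    const char *path;
    bool is_bool;
    bool value;
} demo_entry_t;

static const demo_entry_t demo_tree[] = {
    {"/flag", true, true},
    {"/name", false, false},
};

/* Getter in the same style: status as return value, result via out.
 * *out is only written when DEMO_OK is returned. */
static demo_err_t demo_get_bool(const char *path, bool *out) {
    for (size_t i = 0; i < sizeof demo_tree / sizeof demo_tree[0]; i++) {
        if (strcmp(demo_tree[i].path, path) != 0)
            continue;
        if (!demo_tree[i].is_bool)
            return DEMO_ERR_TYPE_MISMATCH;
        *out = demo_tree[i].value;
        return DEMO_OK;
    }
    return DEMO_ERR_NOT_FOUND;
}
```

Callers written against this pattern check the return value first and only read the output variable on success.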

Todo

Add support for referencing ASDF schemas.

asdf_value_t *asdf_get_value(asdf_file_t *file, const char *path)

Get an arbitrary asdf_value_t* out of the tree

Parameters:
  • file – The asdf_file_t* for the file

  • path – The YAML Pointer to the value

Returns:

An asdf_value_t* wrapping the value, or NULL if the path does not exist in the tree

bool asdf_is_mapping(asdf_file_t *file, const char *path)

Check if the value at the given tree path is a YAML mapping

Parameters:
  • file – The asdf_file_t* for the file

  • path – The YAML Pointer to the value

Returns:

true if the value is a mapping, false if it is another type of value or if no value exists at that path.

asdf_value_err_t asdf_get_mapping(asdf_file_t *file, const char *path, asdf_mapping_t **out)

Get a mapping out of the ASDF tree

Note

Mappings are currently represented as generic asdf_value_t*, though if this function returns ASDF_VALUE_OK it is guaranteed to be a mapping. This function will also ignore tags, so that tagged objects like core/ndarray can be read as a raw YAML mapping.

Todo

In the future may add a dedicated typedef for mappings to make this more explicit.

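Parameters:
  • file – The asdf_file_t* for the file

  • path – The YAML Pointer to the mapping

  • out – An asdf_mapping_t** into which to return the mapping

Returns: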

ASDF_VALUE_OK if the value exists and is a mapping, otherwise ASDF_VALUE_ERR_NOT_FOUND or ASDF_VALUE_ERR_TYPE_MISMATCH.

bool asdf_is_sequence(asdf_file_t *file, const char *path)

Check if the value at the given tree path is a YAML sequence

Parameters:
  • file – The asdf_file_t* for the file

  • path – The YAML Pointer to the value

Returns:

true if the value is a sequence, false if it is another type of value or if no value exists at that path.

asdf_value_err_t asdf_get_sequence(asdf_file_t *file, const char *path, asdf_sequence_t **out)

Get a sequence out of the ASDF tree

Note

Sequences are currently represented as generic asdf_value_t*, though if this function returns ASDF_VALUE_OK it is guaranteed to be a sequence. Like asdf_get_mapping, this function will ignore tags, so that tagged sequences associated with an extension schema can be read as a raw YAML sequence.

Todo

In the future may add a dedicated typedef for sequences to make this more explicit.

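Parameters:
  • file – The asdf_file_t* for the file

  • path – The YAML Pointer to the sequence

  • out – An asdf_sequence_t** into which to return the sequence

Returns: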

ASDF_VALUE_OK if the value exists and is a sequence, otherwise ASDF_VALUE_ERR_NOT_FOUND or ASDF_VALUE_ERR_TYPE_MISMATCH.

bool asdf_is_string(asdf_file_t *file, const char *path)

Check if the value at the given tree path is a string scalar

Note

libasdf adheres to the YAML Core Schema in the interpretation of scalar values. So here “is a string” means strictly not interpreted as any other data type (int, bool, etc.) under the YAML Core Schema. This is the same convention used by YAML libraries in many other programming languages, such as Python.

To check if the value is simply a scalar of any type use asdf_is_scalar.

Parameters:
  • file – The asdf_file_t* for the file

  • path – The YAML Pointer to the value

Returns:

true if the value is a string, false if it is another type of value or if no value exists at that path.

asdf_value_err_t asdf_get_string(asdf_file_t *file, const char *path, const char **out, size_t *out_len)

Get a string out of the ASDF tree

This version returns the string without a null terminator and stores its length in the out_len parameter. It employs zero-copy where possible, so the memory holding the string may become unusable once the file is closed.

Note

See the note about asdf_is_string. This only returns ASDF_VALUE_OK if the value exists and is strictly a string. For a more generic version that returns the raw text of a scalar see asdf_get_scalar.

Parameters:
  • file – The asdf_file_t* for the file

  • path – The YAML Pointer to the string

  • out – A const char** into which to return the string as a const char*

  • out_len – A size_t* into which to return the length of the string

Returns:

ASDF_VALUE_OK if the value exists and is a string, otherwise ASDF_VALUE_ERR_NOT_FOUND or ASDF_VALUE_ERR_TYPE_MISMATCH.
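
Because the string returned through out is not null-terminated, it has to be handled together with out_len. The helpers below are a sketch of two common ways to do that in plain C; copy_to_cstr and print_slice are hypothetical names and are not part of libasdf.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Copy a length-delimited string into a freshly allocated,
 * null-terminated buffer. Returns NULL on allocation failure;
 * the caller frees the result with free(). */
static char *copy_to_cstr(const char *s, size_t len) {
    char *copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;
    memcpy(copy, s, len);
    copy[len] = '\0';
    return copy;
}

/* Print a length-delimited string without copying it, using the
 * precision field of %s to bound the number of bytes read. */
static int print_slice(FILE *fp, const char *s, size_t len) {
    return fprintf(fp, "%.*s\n", (int)len, s);
}
```

Making a copy is also one way to keep the text usable after the file has been closed.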

asdf_value_err_t asdf_get_string0(asdf_file_t *file, const char *path, const char **out)

Get a null-terminated string out of the ASDF tree

Like asdf_get_string but returns a null-terminated copy of the string.

Parameters:
  • file – The asdf_file_t* for the file

  • path – The YAML Pointer to the string

  • out – A const char** into which to return the string as a const char*

Returns:

ASDF_VALUE_OK if the value exists and is a string, otherwise ASDF_VALUE_ERR_NOT_FOUND or ASDF_VALUE_ERR_TYPE_MISMATCH.

bool asdf_is_scalar(asdf_file_t *file, const char *path)

Check if the value at the given tree path is a YAML scalar of any kind

Parameters:
  • file – The asdf_file_t* for the file

  • path – The YAML Pointer to the value

Returns:

true if the value is a scalar, false if it is another type of value or if no value exists at that path.

asdf_value_err_t asdf_get_scalar(asdf_file_t *file, const char *path, const char **out, size_t *out_len)

Like asdf_get_string but returns the raw text of a scalar value as a string without interpretation under the YAML Core Schema.

This can be especially useful in the implementation of Extending libasdf with extension types to process tagged scalars.

Parameters:
  • file – The asdf_file_t* for the file

  • path – The YAML Pointer to the scalar

  • out – A const char** into which to return the scalar as a const char*

  • out_len – A size_t* into which to return the length of the scalar

Returns:

ASDF_VALUE_OK if the value exists and is a scalar, otherwise ASDF_VALUE_ERR_NOT_FOUND or ASDF_VALUE_ERR_TYPE_MISMATCH.

asdf_value_err_t asdf_get_scalar0(asdf_file_t *file, const char *path, const char **out)

Like asdf_get_scalar but returns a null-terminated string

Parameters:
  • file – The asdf_file_t* for the file

  • path – The YAML Pointer to the scalar

  • out – A const char** into which to return the scalar as a const char*

Returns:

ASDF_VALUE_OK if the value exists and is a scalar, otherwise ASDF_VALUE_ERR_NOT_FOUND or ASDF_VALUE_ERR_TYPE_MISMATCH.

bool asdf_is_bool(asdf_file_t *file, const char *path)

Check if the value at the given tree path is a boolean scalar

This returns true for the non-string (that is, unquoted) scalars true/True/TRUE, false/False/FALSE as well as ints 0 or 1 strictly.

Parameters:
  • file – The asdf_file_t* for the file

  • path – The YAML Pointer to the value

Returns:

true if the value is a bool, false if it is another type of value or if no value exists at that path.
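
The set of spellings accepted as booleans can be written out as a small predicate over the raw scalar text. This is a sketch of the rule as stated above, not the library's parser; demo_parse_bool is a hypothetical name, and the sketch assumes the caller has already established that the scalar was unquoted.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true and store the value in *out if the unquoted scalar text
 * (s, len) is one of the boolean spellings listed in the text. */
static bool demo_parse_bool(const char *s, size_t len, bool *out) {
    static const char *const truthy[] = {"true", "True", "TRUE", "1"};
    static const char *const falsy[] = {"false", "False", "FALSE", "0"};

    for (size_t i = 0; i < 4; i++) {
        if (strlen(truthy[i]) == len && memcmp(s, truthy[i], len) == 0) {
            *out = true;
            return true;
        }
        if (strlen(falsy[i]) == len && memcmp(s, falsy[i], len) == 0) {
            *out = false;
            return true;
        }
    }
    return false; /* any other spelling, e.g. "tRue" or "2" */
}
```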

asdf_value_err_t asdf_get_bool(asdf_file_t *file, const char *path, bool *out)

Get a bool value out of the ASDF tree

See asdf_is_bool.

Parameters:
  • file – The asdf_file_t* for the file

  • path – The YAML Pointer to the bool

  • out – A bool* into which to return the bool

Returns:

ASDF_VALUE_OK if the value exists and is a bool, otherwise ASDF_VALUE_ERR_NOT_FOUND or ASDF_VALUE_ERR_TYPE_MISMATCH.

bool asdf_is_null(asdf_file_t *file, const char *path)

Check if the value at the given tree path is null

This returns true for the unquoted scalars null/Null/NULL or ~ as well as empty values (e.g. if a mapping key is followed by nothing but whitespace).

There is no corresponding asdf_get_null as it would probably be useless.

Parameters:
  • file – The asdf_file_t* for the file

  • path – The YAML Pointer to the value

Returns:

true if the value is null, false if it is another type of value or if no value exists at that path.
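
The null rule can be sketched in the same way over the raw scalar text. demo_is_null is a hypothetical helper that only restates the spellings listed above, again assuming the scalar was unquoted.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if the unquoted scalar text (s, len) is empty or is one of
 * the null spellings listed in the text. */
static bool demo_is_null(const char *s, size_t len) {
    static const char *const spellings[] = {"null", "Null", "NULL", "~"};

    if (len == 0)
        return true; /* empty value, e.g. a key followed by nothing */
    for (size_t i = 0; i < sizeof spellings / sizeof spellings[0]; i++) {
        if (strlen(spellings[i]) == len && memcmp(s, spellings[i], len) == 0)
            return true;
    }
    return false;
}
```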

Integer getters

The following functions are the type checkers and getters for integer types.

When libasdf detects an integer scalar it assigns to it the smallest C integer type that can hold that value. For example the number 42 is typed as ASDF_VALUE_UINT8.

However, up-casting to larger integer types is allowed. Casting that would cause an overflow is not allowed. For example 42 can be cast to an int16, but -42 cannot be cast to a uint16.

Note

In practice, unless you know some schema expects a small integer for a value, you will mostly just want to use asdf_get_int64.

With the asdf_get_(uint)N getters the asdf_value_err_t return value may also be ASDF_VALUE_ERR_OVERFLOW if the value is an integer that is too large to represent in the requested type.

Big integers (greater than UINT64_MAX or less than INT64_MIN) are not supported; in fact the ASDF Standard expressly forbids writing them to ASDF files. Nevertheless they could be supported in the future if the need arises. Technically the ASDF Standard disallows integers greater than INT64_MAX, but here we do allow unsigned integers up to UINT64_MAX.
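
The casting rule amounts to a range check against the requested type. The sketch below shows that check for the 16-bit cases used in the example above; fits_int16, fits_uint16 and demo_to_uint16 are hypothetical helpers, and the return codes 0 and 1 merely stand in for ASDF_VALUE_OK and ASDF_VALUE_ERR_OVERFLOW.

```c
#include <stdbool.h>
#include <stdint.h>

/* A value fits a target type only if converting it cannot overflow. */
static bool fits_int16(int64_t v) {
    return v >= INT16_MIN && v <= INT16_MAX;
}

static bool fits_uint16(int64_t v) {
    return v >= 0 && v <= UINT16_MAX;
}

/* Getter-style wrapper: 0 on success, 1 if the value does not fit.
 * *out is only written on success. */
static int demo_to_uint16(int64_t v, uint16_t *out) {
    if (!fits_uint16(v))
        return 1;
    *out = (uint16_t)v;
    return 0;
}
```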

bool asdf_is_int(asdf_file_t *file, const char *path)

Check if the value at the given tree path is an integer scalar of any byte size

Parameters:
  • file – The asdf_file_t* for the file

  • path – The YAML Pointer to the value

Returns:

true if the value is an integer, false if it is another type of value or if no value exists at that path.

bool asdf_is_int8(asdf_file_t *file, const char *path)

See Integer getters

asdf_value_err_t asdf_get_int8(asdf_file_t *file, const char *path, int8_t *out)

See Integer getters

bool asdf_is_int16(asdf_file_t *file, const char *path)

See Integer getters

asdf_value_err_t asdf_get_int16(asdf_file_t *file, const char *path, int16_t *out)

See Integer getters

bool asdf_is_int32(asdf_file_t *file, const char *path)

See Integer getters

asdf_value_err_t asdf_get_int32(asdf_file_t *file, const char *path, int32_t *out)

See Integer getters

bool asdf_is_int64(asdf_file_t *file, const char *path)

See Integer getters

asdf_value_err_t asdf_get_int64(asdf_file_t *file, const char *path, int64_t *out)

See Integer getters

asdf_get_int

Alias for asdf_get_int64

bool asdf_is_uint8(asdf_file_t *file, const char *path)

See Integer getters

asdf_value_err_t asdf_get_uint8(asdf_file_t *file, const char *path, uint8_t *out)

See Integer getters

bool asdf_is_uint16(asdf_file_t *file, const char *path)

See Integer getters

asdf_value_err_t asdf_get_uint16(asdf_file_t *file, const char *path, uint16_t *out)

See Integer getters

bool asdf_is_uint32(asdf_file_t *file, const char *path)

See Integer getters

asdf_value_err_t asdf_get_uint32(asdf_file_t *file, const char *path, uint32_t *out)

See Integer getters

bool asdf_is_uint64(asdf_file_t *file, const char *path)

See Integer getters

asdf_value_err_t asdf_get_uint64(asdf_file_t *file, const char *path, uint64_t *out)

See Integer getters

Float getters

Similarly to the integer getters, the asdf_is_float function will return true if the floating point value can be represented as accurately in a 32-bit float as in a double (the mantissa and exponent are small).

Otherwise it is safe to use asdf_is_double and asdf_get_double for most cases. The asdf_value_err_t return value can also be ASDF_VALUE_ERR_OVERFLOW if the number is too large to represent as an IEEE 64-bit float (in particular, if strtod sets errno = ERANGE).
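
Both checks can be approximated in standard C. The sketch below is one plausible reading of the rules above, not the library's code: fits_in_float tests whether a double survives a round trip through float, and demo_parse_double reports the strtod ERANGE condition. Both names are hypothetical, and note that strtod may also set ERANGE on underflow, which this sketch does not distinguish.

```c
#include <errno.h>
#include <float.h>
#include <stdbool.h>
#include <stdlib.h>

/* True if narrowing d to float and widening it back loses nothing. */
static bool fits_in_float(double d) {
    if (d > FLT_MAX || d < -FLT_MAX)
        return false; /* out of float range (also rejects infinities) */
    volatile float f = (float)d; /* volatile forces real 32-bit rounding */
    return (double)f == d;
}

/* Parse text as a double. Returns 0 on success, 1 if strtod reports
 * ERANGE, 2 if no number could be parsed. *out is set only on success. */
static int demo_parse_double(const char *text, double *out) {
    char *end = NULL;
    errno = 0;
    double d = strtod(text, &end);
    if (end == text)
        return 2;
    if (errno == ERANGE)
        return 1;
    *out = d;
    return 0;
}
```

Under this reading 0.5 fits in a float, while 0.1 does not, because its nearest double and nearest float differ.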

bool asdf_is_float(asdf_file_t *file, const char *path)

See Float getters

asdf_value_err_t asdf_get_float(asdf_file_t *file, const char *path, float *out)

See Float getters

bool asdf_is_double(asdf_file_t *file, const char *path)

See Float getters

asdf_value_err_t asdf_get_double(asdf_file_t *file, const char *path, double *out)

See Float getters

Extension object getters

bool asdf_is_extension_type(asdf_file_t *file, const char *path, asdf_extension_t *ext)

Todo

Needs Extending libasdf with extension types documentation.

asdf_value_err_t asdf_get_extension_type(asdf_file_t *file, const char *path, asdf_extension_t *ext, void **out)

Todo

Needs Extending libasdf with extension types documentation.

Writing values

asdf_value_err_t asdf_set_value(asdf_file_t *file, const char *path, asdf_value_t *value)

Todo

Needs general explanation, to do in #114