NAAICE Low-Level API Documentation

This page documents the structures and functions available to client implementations. Server methods are documented on the Software-NAA page. The client (a host machine) requests RPCs on a remote machine, which can be either a host running our software or any API-compatible device such as an FPGA. Both the host client and the server software are implemented as state machines. In the error-free case, the client progresses through the states shown in the following figure:

[Figure: client state machine, ../_images/states_client.png]

Defines

TIMEOUT_RESOLVE_ROUTE

Timeout for resolving RDMA routes, in milliseconds.

POLLING_TIMEOUT

Timeout for poll functions, in milliseconds.

DEFAULT_TIMEOUT

Default timeout for blocking calls to the NAA, in seconds.

DEFAULT_RETRY_COUNT

Default retry count for the RDMA connection.

Structs & Enums

enum naaice_communication_state

Connection state machine for NAA communication.

Defines the possible states of a communication session between the host and NAA. States represent the progression from connection establishment, through MRSP and data transfer, to completion.

Host states:

  • NAAICE_INIT: Starting state.

  • NAAICE_READY: Address resolved.

  • NAAICE_CONNECTED: Connection established.

  • NAAICE_MRSP_SENDING: Posting write for MRSP packet.

  • NAAICE_MRSP_RECEIVING: Waiting for / processing MRSP response from NAA.

  • NAAICE_MRSP_DONE: Finished MRSP.

  • NAAICE_DATA_SENDING: Posting write for data transfer to NAA.

  • NAAICE_DATA_RECEIVING: Waiting for / processing data transfer back from NAA.

  • NAAICE_FINISHED: Done!

NAA states:

  • NAAICE_INIT: Starting state.

  • NAAICE_CONNECTED: Connection established.

  • NAAICE_MRSP_RECEIVING: Waiting for / processing MRSP packet from host.

  • NAAICE_MRSP_SENDING: Posting write for MRSP packet.

  • NAAICE_MRSP_DONE: Finished MRSP.

  • NAAICE_DATA_RECEIVING: Waiting for / processing data transfer from host.

  • NAAICE_CALCULATING: Running RPC.

  • NAAICE_DATA_SENDING: Posting write for data transfer back to host.

  • NAAICE_FINISHED: Done!

Note

The state flow differs slightly between the host and the NAA.

Values:

enumerator NAAICE_INIT

Starting state.

enumerator NAAICE_READY

Address resolved (host only).

enumerator NAAICE_CONNECTED

Connection established.

enumerator NAAICE_DISCONNECTED

Connection disconnected.

enumerator NAAICE_MRSP_SENDING

Posting write for MRSP packet.

enumerator NAAICE_MRSP_RECEIVING

Waiting for / processing MRSP packet.

enumerator NAAICE_MRSP_DONE

Finished MRSP.

enumerator NAAICE_DATA_SENDING

Posting write for data transfer.

enumerator NAAICE_CALCULATING

Running RPC on NAA.

enumerator NAAICE_DATA_RECEIVING

Waiting for / processing data transfer.

enumerator NAAICE_FINISHED

Completed all communication.

enumerator NAAICE_ERROR

Error state.
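
The host-side progression listed above can be sketched as a small transition function. This is only an illustration under stated assumptions: the enum below is a local mirror of the documented enumerators (with an `SK_` prefix so it does not clash with `naaice.h`), its numeric values are assumed, and `sketch_host_next` and `sketch_host_steps` are hypothetical helpers that are not part of the library.

```c
#include <assert.h>

/* Local mirror of the documented enumerators so the sketch compiles without
 * naaice.h. The numeric values are an assumption. */
enum sketch_state {
    SK_INIT, SK_READY, SK_CONNECTED, SK_DISCONNECTED,
    SK_MRSP_SENDING, SK_MRSP_RECEIVING, SK_MRSP_DONE,
    SK_DATA_SENDING, SK_CALCULATING, SK_DATA_RECEIVING,
    SK_FINISHED, SK_ERROR
};

/* Hypothetical helper: next host state in the error-free case, following
 * the "Host states" list. */
enum sketch_state sketch_host_next(enum sketch_state s)
{
    switch (s) {
    case SK_INIT:           return SK_READY;
    case SK_READY:          return SK_CONNECTED;
    case SK_CONNECTED:      return SK_MRSP_SENDING;
    case SK_MRSP_SENDING:   return SK_MRSP_RECEIVING;
    case SK_MRSP_RECEIVING: return SK_MRSP_DONE;
    case SK_MRSP_DONE:      return SK_DATA_SENDING;
    case SK_DATA_SENDING:   return SK_DATA_RECEIVING;
    case SK_DATA_RECEIVING: return SK_FINISHED;
    case SK_FINISHED:       return SK_FINISHED;
    /* SK_DISCONNECTED, SK_CALCULATING (NAA only) and SK_ERROR are not part
     * of the error-free host flow. */
    default:                return SK_ERROR;
    }
}

/* Counts the transitions needed to get from SK_INIT to SK_FINISHED. */
int sketch_host_steps(void)
{
    int steps = 0;
    enum sketch_state s = SK_INIT;

    while (s != SK_FINISHED && s != SK_ERROR) {
        s = sketch_host_next(s);
        steps++;
        assert(steps <= 12); /* bounded by the number of enumerators */
    }
    return steps;
}
```

The NAA-side flow would differ only in skipping the READY state, swapping the order of the MRSP and data send/receive states, and inserting the CALCULATING state between data receive and data send.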

struct naaice_communication_context
#include <naaice.h>

Communication context for NAA connections.

Holds all information about a connection, including RDMA resources, memory regions, current state, function codes, and transfer metadata. This struct is passed to almost all API functions.

Public Functions

ATOMIC_TYPE (naaice_communication_state) state

Current communication state.

Public Members

struct rdma_cm_id *id

RDMA communication identifier.

struct rdma_event_channel *ev_channel

RDMA event channel.

struct ibv_context *ibv_ctx

IBV device context.

struct ibv_pd *pd

Protection domain for memory regions.

struct ibv_comp_channel *comp_channel

Completion channel.

struct ibv_cq *cq

Completion queue.

struct ibv_qp *qp

Queue pair.

double timeout

Timeout for operations, in seconds.

uint8_t retry_count

Retry count for RDMA connection.

uint16_t max_inline_data

Maximum number of bytes that can be sent inline. Inline data is copied by the CPU from the user buffer into the work queue entry (WQE) and then sent over the network to the remote side. Inline copy is faster for small messages.

struct naaice_mr_local *mr_local_data

Array of local memory regions.

uint8_t no_local_mrs

Number of local memory regions.

uint8_t mr_return_idx

Index of the return memory region (1..no_local_mrs), set by ::naaice_set_metadata.

struct naaice_mr_peer *mr_peer_data

Array of peer memory regions representing symmetric parameters.

uint8_t no_peer_mrs

Number of peer memory regions.

struct naaice_mr_local *mr_local_message

Local memory region used for MRSP messages.

struct naaice_mr_internal *mr_internal

Array of internal memory regions on NAA used for computation.

uint8_t no_internal_mrs

Number of internal memory regions.

uint8_t fncode

Function code specifying which NAA routine to call.

uint8_t naa_returncode

Return code indicating success or failure of NAA routine.

uint8_t rdma_writes_done

Number of RDMA writes performed to NAA.

uint32_t bytes_received

Number of bytes received from NAA.

uint8_t no_input_mrs

Number of input memory regions (parameters).

uint8_t no_output_mrs

Number of output memory regions (parameters).

unsigned int no_rpc_calls

Number of RPC calls performed on this connection.

union naaice_communication_context

32-bit immediate value sent during RDMA transfers.

Functions

int naaice_init_communication_context(struct naaice_communication_context **comm_ctx, uint64_t addr_offset, size_t *param_sizes, char **params, unsigned int params_amount, unsigned int internal_mr_amount, size_t *internal_mr_sizes, uint8_t fncode, const char *local_address, const char *remote_address, uint16_t remote_cm_port)

Initializes communication context struct.

After a call to this function, the provided communication context struct is ready to be passed to all other API functions.

Parameters:
  • comm_ctx – [out] Double pointer to communication context struct to be initialized. Must not point to an existing struct; the struct is allocated and returned by this function.

  • param_sizes – Array of sizes (in bytes) of the provided routine parameters.

  • params – Array of pointers to parameter data. Must be preallocated by the host application.

  • params_amount – Number of parameters. Used to index param_sizes and params.

  • internal_mr_amount – Number of internal memory regions. Used to index internal_mr_sizes.

  • internal_mr_sizes – Array of sizes (in bytes) of the internal memory regions.

  • fncode – Function code specifying which NAA routine is called.

  • local_address – Local address string (e.g. “10.3.10.136”). Optional; if NULL, it is not used. Do not provide a local address when running in loopback mode.

  • remote_address – Remote address string (e.g. “10.3.10.135”).

  • remote_cm_port – Remote connection port. Must match the local port of the server (either NAA or SW NAA).

Returns:

0 on success, -1 on failure.
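
As an illustration of how the `params` and `param_sizes` arrays fit together, the following sketch preallocates the parameter buffers on the host. `sketch_alloc_params` and `sketch_free_params` are hypothetical helpers, not part of the API. The call shown in the trailing comment uses placeholder values: `addr_offset` is not described on this page, so 0 is an assumption, as are the function code 1, the port 12345, and passing NULL for `internal_mr_sizes` when no internal regions are requested.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: allocates `count` zeroed buffers so that params[i]
 * points to sizes[i] bytes. Returns 0 on success, -1 on failure, following
 * the return convention used on this page. */
int sketch_alloc_params(char **params, const size_t *sizes, unsigned int count)
{
    assert(params != NULL && sizes != NULL);

    for (unsigned int i = 0; i < count; i++) {
        params[i] = calloc(1, sizes[i]);
        if (params[i] == NULL) {
            /* Roll back the buffers allocated so far. */
            while (i > 0)
                free(params[--i]);
            return -1;
        }
    }
    return 0;
}

/* Hypothetical helper: releases the buffers allocated above. */
void sketch_free_params(char **params, unsigned int count)
{
    for (unsigned int i = 0; i < count; i++) {
        free(params[i]);
        params[i] = NULL;
    }
}

/* With the arrays prepared, the call could look like this (placeholder
 * values, see the text above):
 *
 *   struct naaice_communication_context *comm_ctx = NULL;
 *   size_t sizes[2] = { 1024 * sizeof(double), 1024 * sizeof(double) };
 *   char *params[2];
 *
 *   if (sketch_alloc_params(params, sizes, 2) != 0)
 *       return -1;
 *   if (naaice_init_communication_context(&comm_ctx, 0, sizes, params, 2,
 *                                         0, NULL, 1, NULL, "10.3.10.135",
 *                                         12345) != 0)
 *       return -1;
 */
```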

int naaice_poll_connection_event(struct naaice_communication_context *comm_ctx, struct rdma_cm_event *ev, struct rdma_cm_event *ev_cp)

Polls for a connection event on the RDMA event channel.

Polls the RDMA event channel stored in the communication context. If a connection event is received, it is stored in the provided event pointer.

Parameters:
  • comm_ctx – Pointer to the communication context describing the connection to be polled.

  • ev – [out] Pointer to the received RDMA connection event.

Returns:

0 on success (regardless of whether an event was received), -1 on failure.

int naaice_poll_and_handle_connection_event(struct naaice_communication_context *comm_ctx)

Poll and handle a connection event on the RDMA event channel.

Polls the RDMA event channel stored in the communication context and, if an event is received, dispatches it to the appropriate connection event handler. This function is a convenience wrapper that combines polling and handling using the previously defined poll and handler functions.

Parameters:

comm_ctx – Pointer to the communication context describing the connection.

Returns:

0 on success (regardless of whether an event was received), -1 on failure.

int naaice_setup_connection(struct naaice_communication_context *comm_ctx)

Set up the RDMA connection.

Repeatedly polls for and handles RDMA connection events until the connection setup is complete. This function is a blocking helper that internally calls naaice_poll_and_handle_connection_event.

Parameters:

comm_ctx – Pointer to the communication context describing the connection.

Returns:

0 on success, -1 on failure (e.g. due to timeout).

int naaice_register_mrs(struct naaice_communication_context *comm_ctx)

Register local memory regions.

Registers local memory regions as IBV memory regions using ::ibv_reg_mr. This includes memory regions corresponding to input and output parameters, the single metadata memory region, and the memory region used for MRSP.

If an error occurs during registration, the remote peer is notified via naaice_send_message.

Parameters:

comm_ctx – Pointer to the communication context describing the connection and associated memory regions.

Returns:

0 on success, -1 on failure.

int naaice_set_parameter_mrs(struct naaice_communication_context *comm_ctx, unsigned int n_parameter_mrs, uint64_t *local_addrs, uint64_t *remote_addrs, size_t *sizes)

Allocate and initialize parameter memory region descriptors.

Allocates and initializes internal structures for parameter (i.e. non-internal) memory regions. This function is called from naaice_init_communication_context.

Parameters:
  • comm_ctx – Pointer to the communication context describing the connection and memory regions.

  • n_parameter_mrs – Number of parameter memory regions for which descriptors will be allocated.

  • local_addrs – Array of local (user-space) addresses of the memory regions. This should correspond to the params array passed to naaice_init_communication_context.

  • remote_addrs – Array of remote addresses in NAA memory where the memory regions are requested to be stored. These addresses are obtained from the memory management service.

  • sizes – Array of sizes (in bytes) of the memory regions. This should correspond to the param_sizes array passed to naaice_init_communication_context.

Returns:

0 on success, -1 on failure.

int naaice_set_input_mr(struct naaice_communication_context *comm_ctx, unsigned int input_mr_idx)

Mark a memory region as an input parameter.

Adds information to the communication context indicating whether a memory region contains an input parameter, an output parameter, or both (see naaice_set_output_mr). Input memory regions are written to the NAA during data transfer, while output memory regions are written back from the NAA.

This function must be called before naaice_init_mrsp.

Parameters:
  • comm_ctx – Pointer to the communication context describing the connection.

  • input_mr_idx – Index of the memory region to be marked as input.

Returns:

0 on success, -1 on failure.

int naaice_set_output_mr(struct naaice_communication_context *comm_ctx, unsigned int output_mr_idx)

Mark a memory region as an output parameter.

int naaice_set_singlesend_mr(struct naaice_communication_context *comm_ctx, unsigned int singlesend_mr_idx)

Mark a memory region as a single-send region.

Adds information to the communication context indicating that a memory region is a single-send region. Single-send regions are written exactly once during the first RPC on a connection and are not written again during subsequent RPCs.

If the memory region is not already marked as an input region, it is automatically marked as one.

This function must be called before naaice_init_mrsp.

Parameters:
  • comm_ctx – Pointer to the communication context describing the connection.

  • singlesend_mr_idx – Index of the memory region to be marked as a single-send region.

Returns:

0 on success, -1 on failure.

int naaice_set_internal_mrs(struct naaice_communication_context *comm_ctx, unsigned int n_internal_mrs, uint64_t *addrs, size_t *sizes)

Add internal memory regions to the communication context.

Adds information about internal memory regions to the communication context. Internal memory regions exist only on the NAA side and are used for computation; their contents are not communicated during data transfer.

This function must be called before naaice_init_mrsp. The internal memory regions will be included in the memory region announcement message, indicating that they should be allocated by the NAA.

Should only be called once per communication context. Each call overwrites any previous internal memory region information.

Parameters:
  • comm_ctx – Pointer to the communication context describing the connection.

  • n_internal_mrs – Number of internal memory regions.

  • addrs – Array of addresses of the internal memory regions in NAA memory space. These addresses will be requested of the NAA during MRSP.

  • sizes – Array of sizes (in bytes) of the internal memory regions.

Returns:

0 on success, -1 on failure.

int naaice_set_immediate(struct naaice_communication_context *comm_ctx, uint8_t *imm_bytes)

Set the immediate value for data transfer.

Sets the immediate value to be written during data transfer. The 32-bit immediate carries up to 3 user-specified bytes in its upper 3 bytes; the lowest byte is reserved for the function code.

This function must be called before naaice_init_data_transfer.

Parameters:
  • comm_ctx – Pointer to the communication context describing the connection.

  • imm_bytes – Array of immediate value bytes. Maximum of 3 bytes.

Returns:

0 on success, -1 on failure.

int naaice_init_mrsp(struct naaice_communication_context *comm_ctx)

Initialize the Memory Region Setup Protocol (MRSP).

Starts the MRSP by sending advertise/request packets and posting a receive for the response. This prepares the connection for memory region registration and data transfer.

Parameters:

comm_ctx – Pointer to the communication context describing the connection and associated memory regions.

Returns:

0 on success, -1 on failure.

int naaice_init_data_transfer(struct naaice_communication_context *comm_ctx)

Start the data transfer to the NAA.

Initiates the data transfer by posting write operations for the memory regions to the NAA. Prepares the connection for remote computation using the previously registered memory regions.

Parameters:

comm_ctx – Pointer to the communication context describing the connection and associated memory regions.

Returns:

0 on success, -1 on failure.

int naaice_handle_work_completion(struct ibv_wc *wc, struct naaice_communication_context *comm_ctx)

Handle a single work completion from the completion queue.

Processes a single work completion event from the completion queue, which typically represents a memory region write either from the host to the NAA or from the NAA to the host.

Parameters:
  • wc – Pointer to the work completion to be handled.

  • comm_ctx – Pointer to the communication context describing the connection.

Returns:

0 on success, -1 on failure.

int naaice_poll_cq_nonblocking(struct naaice_communication_context *comm_ctx)

Poll the completion queue without blocking.

Polls the completion queue for any work completions and processes them using naaice_handle_work_completion if any are received. After handling, updates ::comm_ctx->state to reflect the current state of the NAA connection and routine.

Parameters:

comm_ctx – Pointer to the communication context describing the connection.

Returns:

0 on success (regardless of whether any work completions were received), -1 on failure.
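
Because the non-blocking poll returns 0 whether or not a completion arrived, the caller has to inspect the state to find out when the transfer is done. A possible driver loop is sketched below. Everything in it is a stand-in so that the block compiles without `naaice.h`: `struct poll_ctx` replaces the communication context, `poll_fake` replaces `naaice_poll_cq_nonblocking`, and the iteration budget `max_polls` is a simplification of a real timeout.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the relevant states and for the communication context. */
enum poll_state { POLL_DATA_SENDING, POLL_DATA_RECEIVING, POLL_FINISHED };
struct poll_ctx { enum poll_state state; };

typedef int (*poll_fn)(struct poll_ctx *ctx);

/* Fake poll function standing in for naaice_poll_cq_nonblocking: each call
 * advances the state by one step and reports success. */
int poll_fake(struct poll_ctx *ctx)
{
    if (ctx->state != POLL_FINISHED)
        ctx->state = (enum poll_state)(ctx->state + 1);
    return 0;
}

/* Polls until the context reaches POLL_FINISHED or the budget is used up.
 * Returns 0 when finished, -1 on poll failure or exhausted budget. */
int poll_until_finished(struct poll_ctx *ctx, poll_fn poll,
                        unsigned int max_polls)
{
    assert(ctx != NULL && poll != NULL);

    for (unsigned int i = 0; i < max_polls; i++) {
        if (ctx->state == POLL_FINISHED)
            return 0;
        if (poll(ctx) != 0)
            return -1;
        /* Other host work could be interleaved here between polls. */
    }
    return ctx->state == POLL_FINISHED ? 0 : -1;
}
```

The benefit of the non-blocking variant is the interleaving point in the loop body; when the host has nothing else to do, the blocking variants are simpler to use.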

int naaice_poll_cq_blocking(struct naaice_communication_context *comm_ctx)

Poll the completion queue, blocking until a completion is available.

Polls the completion queue for a work completion and blocks until at least one completion is available. Once a completion is received, it is processed using naaice_handle_work_completion. After handling, ::comm_ctx->state is updated to reflect the current state of the NAA connection and routine.

Parameters:

comm_ctx – Pointer to the communication context describing the connection.

Returns:

0 on success (regardless of whether any work completions were received), -1 on failure.

int naaice_poll_cq_busy(struct naaice_communication_context *comm_ctx)

Poll the completion queue by busy waiting, blocking until a completion is available.

Polls the completion queue using busy waiting for a work completion and blocks until at least one completion is available. Once a completion is received, it is processed using naaice_handle_work_completion. After handling, ::comm_ctx->state is updated to reflect the current state of the NAA connection and routine.

Parameters:

comm_ctx – Pointer to the communication context describing the connection.

Returns:

0 on success (regardless of whether any work completions were received), -1 on failure.

int naaice_disconnect_and_cleanup(struct naaice_communication_context *comm_ctx)

Disconnect and clean up the communication context.

Terminates the RDMA connection and frees all memory associated with the communication context.

Parameters:

comm_ctx – Pointer to the communication context describing the connection.

Returns:

0 on success, -1 on failure.

int naaice_send_message(struct naaice_communication_context *comm_ctx, enum message_id message_type, uint8_t errorcode)

Send an MRSP message to the remote peer.

Sends an MRSP packet to the remote peer using ::ibv_post_send with opcode ::IBV_WR_SEND.

Parameters:
  • comm_ctx – Pointer to the communication context describing the connection.

  • message_type – Type of message to send. Should be one of MSG_MR_ERR, MSG_MR_AAR, or MSG_MR_A.

  • errorcode – Error code to include in the packet if message_type is MSG_MR_ERR. Ignored for other message types.

Returns:

0 on success, -1 on failure.

int naaice_write_data(struct naaice_communication_context *comm_ctx, uint8_t fncode)

Write memory regions to the NAA.

Writes memory regions, including metadata and input parameters, to the NAA. Uses ::ibv_post_send with opcode ::IBV_WR_RDMA_WRITE for regular regions or ::IBV_WR_RDMA_WRITE_WITH_IMM for the final memory region.

Parameters:
  • comm_ctx – Pointer to the communication context describing the connection.

  • fncode – Function code for the NAA routine. Must be positive; 0 indicates an error.

Returns:

0 on success, -1 on failure.
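
The choice between the two opcodes can be illustrated with a small model. The enum below is a local stand-in for the libibverbs opcodes so that the block compiles without `<infiniband/verbs.h>`, and `wr_pick_opcode` and `wr_recvs_consumed` are hypothetical helpers; how the library orders the regions internally is not specified here.

```c
#include <assert.h>

/* Local stand-ins for IBV_WR_RDMA_WRITE and IBV_WR_RDMA_WRITE_WITH_IMM. */
enum wr_opcode { WR_RDMA_WRITE, WR_RDMA_WRITE_WITH_IMM };

/* Picks the opcode for the region at 0-based position idx out of count
 * regions to be written: only the final write carries the immediate. */
enum wr_opcode wr_pick_opcode(unsigned int idx, unsigned int count)
{
    assert(idx < count);
    return (idx + 1 == count) ? WR_RDMA_WRITE_WITH_IMM : WR_RDMA_WRITE;
}

/* Number of receive requests consumed on the peer by one transfer of count
 * regions: plain RDMA writes consume none, the write with immediate
 * consumes exactly one. */
unsigned int wr_recvs_consumed(unsigned int count)
{
    return count > 0 ? 1u : 0u;
}
```

Since only the last write carries the immediate, the receiver sees a single completion per transfer, which signals that all regions have arrived.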

int naaice_post_recv_mrsp(struct naaice_communication_context *comm_ctx)

Post a receive request for an MRSP message.

Posts a receive request for an MRSP message to the receive queue of the queue pair. The request specifies the memory region to be written to (here, the MRSP region).

Parameters:

comm_ctx – Pointer to the communication context describing the connection.

Returns:

0 on success, -1 on failure.

int naaice_post_recv_data(struct naaice_communication_context *comm_ctx)

Post a receive request for a memory region write.

Posts a receive request for a memory region write. Only the final memory region write (the one with an immediate value) requires a receive request; regular RDMA writes without an immediate do not consume a receive request.

The memory region specified in the receive request is the MRSP region. This is a placeholder, as the actual region written to is determined by the sender.

Parameters:

comm_ctx – Pointer to the communication context describing the connection.

Returns:

0 on success, -1 on failure.

int naaice_init_rdma_resources(struct naaice_communication_context *comm_ctx)

Initialize RDMA resources.

Allocates and initializes the necessary RDMA resources, including a protection domain, completion channel, completion queue, and queue pair.

Parameters:

comm_ctx – Pointer to the communication context describing the connection.

Returns:

0 on success, -1 on failure.

int naaice_do_mrsp(struct naaice_communication_context *comm_ctx)

Execute the MRSP in a blocking manner.

Performs all steps of the Memory Region Setup Protocol (MRSP) in a blocking fashion, ensuring that memory regions are properly advertised, requested, and acknowledged before returning.

Parameters:

comm_ctx – Pointer to the communication context describing the connection.

Returns:

0 on success, -1 on failure.

int naaice_do_data_transfer(struct naaice_communication_context *comm_ctx)

Perform the complete data transfer in a blocking manner.

Executes all steps of the data transfer, including writing data to the NAA, waiting for the NAA to complete computation, and receiving the resulting data back. This function operates in a blocking fashion.

Parameters:

comm_ctx – Pointer to the communication context describing the connection.

Returns:

0 on success, -1 on failure.

int naaice_set_bytes_to_send(struct naaice_communication_context *comm_ctx, int mr_idx, int number_bytes)

Set the number of bytes to send from a memory region.

Specifies how many bytes should be sent from the given memory region during data transfer. Passing 0 resets the size to the original size of the memory region.

Parameters:
  • comm_ctx – Pointer to the communication context describing the connection.

  • mr_idx – Index of the memory region to be modified.

  • number_bytes – Number of bytes to send from the specified memory region. 0 resets to the original size.

Returns:

0 on success, -1 on failure.
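
The reset-on-zero behaviour can be sketched with a stand-alone model. `struct bts_region` and `bts_set_bytes_to_send` are hypothetical and only mirror the documented semantics; rejecting negative values and values larger than the original region size is an assumption about what counts as failure.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one memory region: its original size and the
 * number of bytes currently selected for sending. */
struct bts_region {
    size_t size;
    size_t to_send;
};

/* Mirrors the documented behaviour: 0 restores the original size, any
 * other value selects that many bytes. Returns 0 on success, -1 on failure. */
int bts_set_bytes_to_send(struct bts_region *r, int number_bytes)
{
    if (r == NULL || number_bytes < 0)
        return -1;                      /* assumption: negative is invalid */

    if (number_bytes == 0)
        r->to_send = r->size;           /* reset to the original size */
    else if ((size_t)number_bytes > r->size)
        return -1;                      /* assumption: cannot exceed region */
    else
        r->to_send = (size_t)number_bytes;

    assert(r->to_send <= r->size);      /* invariant of this model */
    return 0;
}
```

Sending only part of a region is useful when a buffer is sized for the worst case but a given RPC fills only its beginning.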

Example

An example of the low-level API can be found in examples/naaice_client.c.