File elemwise.h
Custom elementwise operations generator.
Defines
- GE_SCALAR 0x0001: Argument is a scalar passed from the CPU; requires nd == 0.
- GE_READ 0x0002: Array is read from in the expression.
- GE_WRITE 0x0004: Array is written to in the expression.
- GE_NOADDR64 0x0001: Don’t precompile kernels for 64-bit addressing.
- GE_CONVERT_F16 0x0002: Convert float16 inputs to float32 for computation.
- GE_BROADCAST 0x0100: Allow broadcasting of dimensions of size 1.
- GE_NOCOLLAPSE 0x0200: Disable dimension collapsing (not recommended).
Typedefs
- typedef struct _GpuElemwise GpuElemwise
Elementwise generator structure.
The contents are private.
Functions
- GpuElemwise* GpuElemwise_new(gpucontext * ctx, const char * preamble, const char * expr, unsigned int n, gpuelemwise_arg * args, unsigned int nd, int flags)
Create a new GpuElemwise.
This will allocate and initialize a new GpuElemwise object. This object can be used to run the specified operation on different sets of arrays.
The argument descriptors name the arguments and provide their data types and geometry (arrays or scalars). They also specify whether each argument is used for reading or writing. An argument can be used for both.
The expression is a C-like string performing an operation with scalar values named according to the argument descriptors. All of the indexing and selection of the right values is handled by the GpuElemwise code.
- Return
- a new GpuElemwise object or NULL
- Parameters
ctx: the context in which to run the operations
preamble: code to be inserted before the kernel code
expr: the expression to compute
n: the number of arguments
args: the argument descriptors
nd: the number of dimensions to precompile for
flags: see GpuElemwise flags
- void GpuElemwise_free(GpuElemwise * ge)
Free all storage associated with a GpuElemwise.
- Parameters
ge: the GpuElemwise object to free.
- int GpuElemwise_call(GpuElemwise * ge, void ** args, int flags)
Run a GpuElemwise on some inputs.
- Parameters
ge: the GpuElemwise to run
args: pointers to the arguments (must match what was described by the argument descriptors)
flags: see GpuElemwise call flags
- struct gpuelemwise_arg
#include <elemwise.h>
Argument information structure for GpuElemwise.