Universal functions (ufuncs) are highly optimized, vectorized routines (implemented in C) that
apply operations elementwise over NumPy arrays. They support broadcasting and predictable type casting,
expose methods like reduce, accumulate, and outer, and accept where= and out= parameters for masked
and in-place style computation.
Ufuncs replace slow Python loops with vectorized low-level loops, giving large speedups and lower memory
overhead. They make numeric code concise and readable while preserving control over dtype, broadcasting,
and output buffers, which makes them well suited to data science, ML pre-processing, and scientific workloads.
Vectorization means expressing computations as array-wide operations (e.g., x * x + 1) rather than
element-by-element Python loops. NumPy maps these elementwise expressions to ufuncs whose inner loops run in
compiled C (linear-algebra calls such as matrix products are dispatched to BLAS routines instead), cutting
Python overhead and improving cache efficiency, often by orders of magnitude on large arrays.
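To make the contrast concrete, here is a minimal sketch that computes x * x + 1 both ways and checks that the results agree. The helper names poly_loop and poly_vectorized are hypothetical, chosen only for this illustration.

```python
import numpy as np

def poly_loop(values):
    """Element-by-element Python loop computing v * v + 1 (hypothetical helper)."""
    return [v * v + 1 for v in values]

def poly_vectorized(arr):
    """The same computation written as one array-wide expression."""
    return arr * arr + 1

x = np.arange(5, dtype=np.float64)      # [0., 1., 2., 3., 4.]
loop_result = np.array(poly_loop(x))    # built one element at a time in Python
vec_result = poly_vectorized(x)         # dispatched to the multiply and add ufuncs
```

Both versions produce the same values; the difference is where the loop runs, which is why the gap tends to widen as the array grows.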
import numpy as np
x = np.array([1., 4., 9.])
np.sqrt(x) # elementwise sqrt -> [1., 2., 3.]; type(np.sqrt) is <class 'numpy.ufunc'>
a = np.array([1,2,3], dtype=np.float64)
b = np.array([10,20,30], dtype=np.float64)
np.add(a, b) # same as a + b (binary ufunc)
M = np.array([[1,2,3],[4,5,6]], dtype=np.float64) # (2,3)
v = np.array([10,20,30], dtype=np.float64) # (3,)
np.add(M, v) # v broadcasts across rows
np.multiply(M, 2) # scalar broadcast
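As a small sketch of how shapes line up under broadcasting, the example below adds a (3,) row and a (2, 1) column to a (2, 3) matrix, and compares the row case with an explicit np.tile copy. The variable names are illustrative only.

```python
import numpy as np

M = np.arange(6, dtype=np.float64).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]
row = np.array([10., 20., 30.])                    # shape (3,): stretched across rows
col = np.array([[100.], [200.]])                   # shape (2, 1): stretched across columns

by_row = np.add(M, row)   # row is added to every row of M
by_col = np.add(M, col)   # col is added to every column of M

# Same values as by_row, but np.tile materializes a full (2, 3) copy first.
tiled = M + np.tile(row, (2, 1))
```

Broadcasting gives the same result as the tiled version without allocating the repeated array.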
x = np.array([1,2,3,4], dtype=np.int64)
np.add.reduce(x) # 10
np.multiply.reduce(x) # 24
np.add.accumulate(x) # [1, 3, 6, 10]
np.multiply.accumulate(x) # [1, 2, 6, 24]
np.multiply.outer([1,2,3],[10,20]) # 3x2 table
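The ufunc methods above also work on multi-dimensional input. The sketch below assumes a small 2x3 integer matrix and shows how the axis argument selects the direction of the reduction or running total.

```python
import numpy as np

M = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int64)

col_sums = np.add.reduce(M, axis=0)      # collapse rows    -> [5, 7, 9]
row_sums = np.add.reduce(M, axis=1)      # collapse columns -> [6, 15]
running = np.add.accumulate(M, axis=1)   # running sum along each row

# outer applies the ufunc to every pair (a_i, b_j); the result has shape (3, 2).
table = np.multiply.outer(np.array([1, 2, 3]), np.array([10, 20]))
```

Note that reduce defaults to axis=0, so passing the axis explicitly keeps the intent clear.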
where= and out= parameters
x = np.array([-1., 0., 4., 9.])
root = np.zeros_like(x) # pre-fill: entries where the mask is False keep whatever out already holds
np.sqrt(x, where=(x >= 0), out=root) # masked compute
# root: [0., 0., 2., 3.]
y = np.array([1., 2., 3., 4.])
np.add(y, 5., out=y) # in-place style update
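A common use of where= together with out= is guarding a division. The sketch below is one way to do it, with made-up input values: the output buffer is pre-filled with NaN, so positions skipped by the mask hold a known value rather than uninitialized memory.

```python
import numpy as np

num = np.array([1.0, 2.0, 3.0, 4.0])
den = np.array([2.0, 0.0, 4.0, 0.0])

# Pre-fill the output: masked-out positions are left untouched by the ufunc.
ratio = np.full_like(num, np.nan)

# Divide only where the denominator is non-zero; the call returns the out array.
result = np.divide(num, den, out=ratio, where=(den != 0))
```

Pre-filling matters because the ufunc never writes to positions where the mask is False.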
a = np.array([1, 2, 3], dtype=np.int16)
b = np.array([1.5, 2.5, 3.5], dtype=np.float32)
c = np.add(a, b) # upcasts to float32
c.dtype # dtype('float32')
d = np.empty_like(b, dtype=np.float64)
np.add(a, b, out=d) # control result dtype
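To inspect or steer the casting behaviour shown above, a sketch along these lines may help. It uses np.result_type to predict the promoted dtype, the dtype= argument as an alternative to a pre-allocated out= array, and np.can_cast to check whether a narrower output would be allowed under the default same_kind rule.

```python
import numpy as np

a = np.array([1, 2, 3], dtype=np.int16)
b = np.array([1.5, 2.5, 3.5], dtype=np.float32)

# Predict the promoted dtype without computing anything.
promoted = np.result_type(a, b)

# Request a wider computation and result dtype directly.
wide = np.add(a, b, dtype=np.float64)

# Ufunc outputs are cast under the "same_kind" rule by default,
# so a float32 result cannot be written into an int16 out= buffer.
float_to_int_ok = np.can_cast(np.float32, np.int16, casting="same_kind")
```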
np.frompyfunc
import numpy as np
def hypotenuse(a, b): # pure Python
    return (a*a + b*b) ** 0.5
hypot_ufunc = np.frompyfunc(hypotenuse, 2, 1)
a = np.array([3, 5, 8])
b = np.array([4,12,15])
r = hypot_ufunc(a, b) # dtype=object
r = r.astype(np.float64) # cast if numeric needed
frompyfunc preserves broadcasting but returns object dtype and runs Python-level code (slower than native ufuncs).
np.vectorize vs np.frompyfunc
f = lambda x: x**2 + 1
vf = np.vectorize(f) # convenience wrapper
uf = np.frompyfunc(f, 1, 1)
x = np.arange(5)
vf(x), uf(x) # similar API; uf returns object dtype
Note: vectorize is for API convenience; it does not make Python code inherently faster.
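The difference in output dtype can be checked directly. This is a minimal sketch, assuming a hypothetical helper square_plus_one and using the otypes argument of np.vectorize to fix the result dtype; the native array expression is included for comparison.

```python
import numpy as np

def square_plus_one(v):
    """Plain Python scalar function (hypothetical helper for this sketch)."""
    return v ** 2 + 1

vf = np.vectorize(square_plus_one, otypes=[np.float64])  # output dtype fixed up front
uf = np.frompyfunc(square_plus_one, 1, 1)                # 1 input, 1 output

x = np.arange(5)
vec_out = vf(x)      # float64 array
obj_out = uf(x)      # object array; cast before doing numeric work
native = x ** 2 + 1  # native ufuncs, no per-element Python call
```

All three agree on values; only the native expression avoids calling Python once per element.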
Use out= to reuse buffers and reduce temporary allocations.
Use where= to skip invalid math and unnecessary work.
Choose a compact dtype (e.g., float32) for speed/memory.
Prefer broadcasting over materializing copies with tile/repeat.
# Jupyter sketch (use %timeit)
x = np.random.rand(1_000_000).astype(np.float32)
# %timeit np.add(np.multiply(x, x), 1, out=np.empty_like(x))
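As a sketch of the buffer-reuse idea, the example below computes x * x + 1 in float32 with a single pre-allocated output array instead of one temporary per operation. The array size and the name buf are arbitrary choices for illustration, and no timing figures are claimed here.

```python
import numpy as np

x = np.random.rand(1000).astype(np.float32)

# One buffer, allocated once and reused by both steps.
buf = np.empty_like(x)
np.multiply(x, x, out=buf)   # buf = x * x
np.add(buf, 1, out=buf)      # buf = buf + 1, written in place

# Reference result computed the usual way, with temporaries.
expected = x * x + 1
```

In a loop that runs many times, allocating buf once outside the loop is where the saving would come from; measuring with %timeit on your own data is the way to confirm it.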