
FLOPS overview — The Backbone of Efficient Machine Learning

Venkatesh Subramanian
5 min read · Mar 7, 2025


Let’s take a step back and remind ourselves of the fundamental building blocks of Machine Learning (ML), Generative AI (Gen AI), and Natural Language Processing (NLP). Amid the excitement around these cutting-edge technologies, it’s easy to forget that everything rests on one core foundation: numbers. From training complex models to generating human-like text, numerical computations are the unsung heroes driving these innovations.

At the heart of these computations are matrix multiplications, which are fundamental to deep learning and NLP. The efficiency of these operations is measured using FLOPS (Floating Point Operations Per Second), which tells us how many calculations a system can perform in one second. (A note on capitalization: FLOPS with a capital S is that per-second rate, while FLOPs refers to a count of floating-point operations, such as the total a model needs for one forward pass.)
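To make this concrete, here is a minimal Python sketch (my own illustration, not from the original article): it applies the standard 2·m·k·n rule of thumb for a matrix multiplication (one multiply and one add per output term) and times NumPy to estimate the FLOPS actually achieved on your machine.

```python
import time

import numpy as np

def matmul_flops_demo(m: int = 1024, k: int = 1024, n: int = 1024) -> None:
    """Multiply an (m, k) matrix by a (k, n) matrix and report achieved FLOPS."""
    a = np.random.rand(m, k).astype(np.float32)
    b = np.random.rand(k, n).astype(np.float32)

    flop_count = 2 * m * k * n  # one multiply + one add per output term
    start = time.perf_counter()
    _ = a @ b
    elapsed = time.perf_counter() - start

    print(f"{flop_count:,} FLOPs in {elapsed:.4f} s "
          f"= {flop_count / elapsed / 1e9:.1f} GFLOPS")

matmul_flops_demo()
```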

FLOPs are critical for understanding:

  1. Computational speed
  2. Memory usage
  3. The overall efficiency of ML models

As models grow larger and more complex, optimizing FLOPs becomes essential to:

  1. Reduce training time
  2. Decrease resource consumption
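
As a rough back-of-the-envelope sketch of why this matters (the layer sizes below are made up purely for illustration), counting FLOPs per layer shows how quickly the cost of a single forward pass grows with model size:

```python
def dense_layer_flops(batch: int, in_features: int, out_features: int) -> int:
    """FLOPs for one fully connected layer: a (batch, in) x (in, out)
    matrix multiplication plus the bias addition."""
    matmul = 2 * batch * in_features * out_features
    bias_add = batch * out_features
    return matmul + bias_add

# Hypothetical 3-layer MLP processing a batch of 64 inputs
layers = [(784, 512), (512, 512), (512, 10)]
total = sum(dense_layer_flops(64, i, o) for i, o in layers)
print(f"Forward pass: {total:,} FLOPs")
```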

FP16 vs FP32

Floating-point numbers are used to represent real numbers in ML. The two most common formats are FP16 (half-precision) and FP32 (single-precision).

FP16 uses 16 bits:
1 bit for the sign (positive or negative)
5 bits for the exponent (scale)
10 bits for the mantissa (precision)

FP32 uses 32 bits:
1 sign bit
8 exponent bits
23 mantissa bits
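
A quick way to see the difference in practice (a small NumPy sketch of my own, not from the article): each FP32 value occupies 4 bytes versus 2 bytes for FP16, and the shorter 10-bit mantissa of FP16 shows up as a noticeably larger rounding error on the same number.

```python
import numpy as np

x32 = np.float32(0.052)  # single precision: 1 sign + 8 exponent + 23 mantissa bits
x16 = np.float16(0.052)  # half precision:   1 sign + 5 exponent + 10 mantissa bits

print(x32.nbytes, x16.nbytes)   # 4 bytes vs 2 bytes per value
print(f"{float(x32):.10f}")     # 0.0520000011 -- small FP32 rounding error
print(f"{float(x16):.10f}")     # 0.0520019531 -- coarser FP16 rounding error
```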

OK, let’s work through an example.

0.052 -> FP32 representation

The binary representation of 0.052 is approximately 0.0000110101 (0.052 has no exact finite binary expansion, so the digits are truncated here to keep the example readable).
Move the binary point to the right until there’s a single 1 to the left of the point: 1.10101 × 2⁻⁵
Here, the exponent is -5.
The number 0.052 is positive, so the sign bit is 0.
The actual exponent is -5. Add the bias (127 for FP32): -5 + 127 = 122
The binary representation of 122: 01111010
The normalized mantissa is 1.10101. Drop the leading 1 (implicit in FP16/FP32) and use the fractional part 10101
Pad with zeros to make it 23 bits: 10101000000000000000000
Here’s the combined FP32 representation:
Sign: 0
Exponent: 01111010
Mantissa: 10101000000000000000000

Putting the three fields together, 0.052 is stored (with the simplified, truncated mantissa above) as:

0 01111010 10101000000000000000000
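
To double-check the walkthrough, here is a small verification sketch of my own using Python’s struct module: it packs 0.052 as an IEEE 754 single-precision value and prints the stored bit fields. The sign and exponent match the steps above exactly; the stored mantissa begins with 10101 but carries the rounded tail of the binary expansion that the simplified example truncates.

```python
import struct

def fp32_fields(value: float) -> str:
    """Pack value as a big-endian IEEE 754 float32 and split out its bit fields."""
    (raw,) = struct.unpack(">I", struct.pack(">f", value))
    bits = f"{raw:032b}"
    return f"sign={bits[0]} exponent={bits[1:9]} mantissa={bits[9:]}"

print(fp32_fields(0.052))
# sign=0 exponent=01111010 mantissa=10101001111110111110100
# (the mantissa tail differs from the padded 10101000... above because the
#  real binary expansion of 0.052 continues and is rounded, not truncated)
```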
