I'm interested in identifying overflowing values when adding unsigned 8-bit integers, and clamping the result to 0xFF:
__m128i m1 = _mm_loadu_si128(/* 16 8-bit unsigned integers */);
__m128i m2 = _mm_loadu_si128(/* 16 8-bit unsigned integers */);
__m128i m3 = _mm_adds_epu8(m1, m2);
I would like to perform a "less than" comparison on these unsigned integers, similar to what _mm_cmplt_epi8 does for signed ones:
__m128i mask = _mm_cmplt_epi8 (m3, m1);
m1 = _mm_or_si128(m3, mask);
If an "epu8" equivalent were available, mask would hold 0xFF where m3[i] < m1[i] (overflow!) and 0x00 otherwise, and we could clamp using the "or", so that m1 holds the addition result where it is valid and 0xFF where it overflowed.
The problem is that _mm_cmplt_epi8 performs a signed comparison. For instance, if m1[i] = 0x70 and m2[i] = 0x10, then m3[i] = 0x80, which is -128 when read as a signed byte and therefore compares as less than 0x70. The result is mask[i] = 0xFF, which is obviously not what I require.
Using VS2012.
I would appreciate another approach for performing this. Thanks!
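One possible direction, as a minimal sketch rather than a definitive answer: flipping the top bit of both operands (XOR with 0x80) maps unsigned byte order onto signed byte order, so the existing signed compare can stand in for the missing unsigned one. The helper names below (cmplt_epu8_sketch, add_clamp_epu8_sketch, lane_u8) are hypothetical and not part of any library. The sketch also assumes a wrapping add (_mm_add_epi8) instead of _mm_adds_epu8, because the "sum < operand" test only fires when the sum actually wraps around.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: per-byte unsigned "a < b", built on the signed compare.
   XOR-ing both operands with 0x80 maps unsigned order onto signed order. */
static __m128i cmplt_epu8_sketch(__m128i a, __m128i b)
{
    const __m128i bias = _mm_set1_epi8(-128);   /* 0x80 in every byte */
    return _mm_cmplt_epi8(_mm_xor_si128(a, bias), _mm_xor_si128(b, bias));
}

/* Hypothetical helper: wrapping add, then force 0xFF where the sum overflowed. */
static __m128i add_clamp_epu8_sketch(__m128i a, __m128i b)
{
    __m128i sum  = _mm_add_epi8(a, b);          /* wraps modulo 256 */
    __m128i mask = cmplt_epu8_sketch(sum, a);   /* 0xFF where sum < a, i.e. overflow */
    return _mm_or_si128(sum, mask);             /* clamp overflowed bytes to 0xFF */
}

/* Hypothetical helper for inspection: byte i of v as an unsigned value. */
static uint8_t lane_u8(__m128i v, int i)
{
    uint8_t out[16];
    assert(i >= 0 && i < 16);
    _mm_storeu_si128((__m128i *)out, v);
    return out[i];
}
```

With m1[i] = 0x70 and m2[i] = 0x10 the biased compare leaves mask[i] at 0x00, so the sum 0x80 passes through unchanged, while an overflowing pair such as 0xF0 and 0x20 gets clamped to 0xFF. The mask itself marks which bytes overflowed, which is the part a saturating add alone does not report.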