> 3. It's probably cheaper to perform the HAS_ZERO check just once on (half1 | half2). We have to compute (half1 | half2) anyway.
Wouldn't you have to check (half1 & half2) instead?
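
To make the concern concrete, here is a minimal sketch. It assumes HAS_ZERO is the classic "word contains a zero byte" bit trick on 32-bit words, and that half1 and half2 are raw data words where a zero byte in either one should be detected. The macro body, the helper names and the sample values are my assumptions, not taken from the code under discussion.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed definition: the classic test for a zero byte in a 32-bit word. */
#define HAS_ZERO(v) \
    ((((v) - UINT32_C(0x01010101)) & ~(v) & UINT32_C(0x80808080)) != 0)

/* Reference behaviour: is there a zero byte in either half? */
static int zero_in_either(uint32_t half1, uint32_t half2)
{
    return HAS_ZERO(half1) || HAS_ZERO(half2);
}

/* Hypothetical self-check comparing the OR and AND variants. */
static void check_examples(void)
{
    uint32_t a = UINT32_C(0x41004141);  /* contains a zero byte */
    uint32_t b = UINT32_C(0x42424242);  /* contains no zero byte */

    assert(zero_in_either(a, b));
    assert(!HAS_ZERO(a | b));  /* a | b == 0x43424343: the zero is masked */
    assert(HAS_ZERO(a & b));   /* a & b == 0x40004040: the zero survives */

    /* AND can also report a zero byte that neither half contains. */
    uint32_t c = UINT32_C(0x0F0F0F0F);
    uint32_t d = UINT32_C(0xF0F0F0F0);

    assert(!zero_in_either(c, d));
    assert(HAS_ZERO(c & d));   /* c & d == 0 */
}
```

Under those assumptions, the OR version can miss a terminator, while the AND version never misses one but can give false positives, so a hit would still need a per-half recheck. If the halves are already masks in which nonzero means "match", the picture is different.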