DOS ain't dead

CMOV (Developers)

posted by marcov, 23.03.2020, 12:52

(Yeah, this is probably all irrelevant to the thread, since the subject was mostly 16-bit-only CPUs. But I originally started answering from a size perspective, so this is just to conclude this subthread.)

I researched a bit more, and there seem to be several separate issues:

- cmov has a latency of 1 (AMD) or 2 (Intel) cycles, which matters if it forms a dependency chain with the instructions coming after it. In that case the branched form might be more worthwhile, if the branch is correctly predicted and the opcodes are sufficiently fused.
- I found some references saying that using branches might confuse the branch predictor, without many details, except the general advice to minimize branches.
- The older the CPU (superscalar ones, so P6+), the fewer inputs a single uop can have. Since cmov also depends on flags (it has two arguments + flags), on older CPUs (before Sandy Bridge) many combinations couldn't be a single uop. Even now there are more problems with e.g. the indexed versions (which take another input register). Sandy Bridge probably raised the limit to 3 inputs because of the three-address AVX instructions.
- The exact dependencies also depend on the form (the flags used). Carry and the other flags are separate dependencies, so if a form needs carry and another flag, uop fusion probably won't happen.
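As an illustration of the dependency-chain point, here is a minimal C sketch of the two forms. The function names `max_branch` and `max_select` are hypothetical, and which instruction the compiler actually picks for either one is an assumption: it depends on the compiler, the flags and the target CPU, so check the generated assembly (e.g. `gcc -O2 -S`) before drawing conclusions.

```c
#include <stddef.h>
#include <stdint.h>

/* Branchy form: typically written hoping for cmp + jcc.
 * If the branch is well predicted, the update of `best` is
 * speculated and does not sit on the critical path.
 * Assumes n >= 1. */
int32_t max_branch(const int32_t *v, size_t n)
{
    int32_t best = v[0];
    for (size_t i = 1; i < n; i++) {
        if (v[i] > best)
            best = v[i];
    }
    return best;
}

/* Select form: the ternary is a common candidate for cmp + cmov.
 * `best` is both an input and the output of the select, so every
 * iteration depends on the previous one: a loop-carried dependency
 * chain whose length includes the cmov latency.
 * Assumes n >= 1. */
int32_t max_select(const int32_t *v, size_t n)
{
    int32_t best = v[0];
    for (size_t i = 1; i < n; i++)
        best = (v[i] > best) ? v[i] : best;
    return best;
}
```

Both functions return the same result; the only difference is which code shape they suggest to the compiler. On mostly sorted input the branch is predictable and the branchy form may win, while on random input the select form avoids mispredictions at the cost of the chain.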

The current opinion seems to be to use cmov instructions unless there is a very clear dependency chain. Cmov seems to be put in the same group as the adc instruction, presumably because adc also reads a flag in addition to its two operands.

 
