Regarding the thread starter: there are people who make a point of studying "optimal" ISAs. Look for the discussions around the design of the RISC-V architecture to see how people have gone about it. There's also the "Mill" architecture, which is doing some really wacky things ... so wacky it's hard to evaluate. (But the inventor's lecture series on youtube about how the Mill works does present some fascinating ideas.)
pubby wrote:
And less instructions (RISC) tends to be better than more.
It seemed to be.
RISC is the right way to get "any computer" as cheaply as possible, but the past twenty years have shown that simpler, more orthogonal instruction sets don't buy much performance on their own: the silicon cost of a superscalar core eats so much die space that the savings from a simple decoder stop mattering.
Ideally, you'd have an ISA with an infinite number of registers, complete orthogonality between instructions, and every instruction encoded in 0 bits. Something has to give in a real-world design; orthogonality is often the first victim. Fancy-seeming instructions (like bit test and bit set), despite being redundant with other instructions, are too critical to real-world application performance not to include as first-class operations. The performance of signal-processing instructions (such as multiply-and-accumulate, or various SIMD operations) is just as important. Before long you discover you've built a weird CISC ISA that sure isn't RISC by anything but the most generous definitions of the term; it's just a very different set of first-class instructions than the x86 or 68k had.
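To make the "redundant but critical" point concrete, here's a rough C sketch of my own (not taken from anyone's spec): every operation below can be spelled with plain shifts, masks, multiplies and adds, yet real ISAs keep growing dedicated encodings for them (x86's BT/BTS, RISC-V's Zbs extension, MAC and dot-product instructions in DSPs and SIMD extensions) because they sit in the hottest loops.

Code:
#include <stdint.h>
#include <stdio.h>

/* "Redundant" bit operations spelled out with plain shifts and masks.
   Without dedicated bit instructions each of these costs two or three
   instructions; with something like x86 BT/BTS or RISC-V Zbs it's one. */
static inline unsigned bit_test(uint32_t word, unsigned n)
{
    return (word >> n) & 1u;
}

static inline uint32_t bit_set(uint32_t word, unsigned n)
{
    return word | (1u << n);
}

/* Multiply-and-accumulate: the inner loop of most DSP-style kernels.
   A fused MAC (or a SIMD dot-product instruction) turns the mul + add
   in the loop body into a single operation per element. */
static int32_t dot(const int16_t *a, const int16_t *b, int n)
{
    int32_t acc = 0;
    for (int i = 0; i < n; i++)
        acc += (int32_t)a[i] * b[i];  /* separate mul + add, or one MAC */
    return acc;
}

int main(void)
{
    int16_t a[4] = {1, 2, 3, 4}, b[4] = {5, 6, 7, 8};
    uint32_t flags = bit_set(0, 3);
    printf("bit 3 set: %u, dot: %d\n", bit_test(flags, 3), (int)dot(a, b, 4));
    return 0;
}

None of this is exotic; the point is simply that when code like this dominates your profile, "one instruction instead of three" wins even though the three-instruction version is perfectly expressible.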