By "servers" I mean something quite specific in a theoretical sense: a server is something that serves a queue of transactions.
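That definition can be sketched in a few lines of Python (my own illustration, not anything from a real system): one worker thread draining a queue of transactions, where "servicing" a transaction is a stand-in for real work.

```python
import queue
import threading

def server(q, results):
    """Serve transactions from the queue until a None sentinel arrives."""
    while True:
        txn = q.get()
        if txn is None:
            break
        results.append(txn * 2)  # "service" the transaction (stand-in for real work)
        q.task_done()

q = queue.Queue()
results = []
worker = threading.Thread(target=server, args=(q, results))
worker.start()

for txn in range(5):
    q.put(txn)       # transactions arrive and wait in the queue
q.put(None)          # sentinel: tell the server to stop
worker.join()
print(results)       # [0, 2, 4, 6, 8]
```

Everything in the theory (utilization, response time, bottlenecks) is about how fast transactions arrive in that queue versus how fast the server drains it.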
I don't want to get into the details, as the mathematics gets very gnarly and would fill a couple of semesters; but here's a thumbnail sketch:
You want all of your servers (CPU, disk, memory, network, and the user(s) are the most obvious) to be as close to 100% utilized as possible while not increasing overall response time to an annoying degree. For an interactive system (where the users are the source of the transactions), that means you want keystrokes and mouse clicks to be serviced effectively instantaneously. Non-interactive transactions, such as producing a video, can generally tolerate a longer response time.
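To see why "as close to 100% as possible" has to be qualified by response time, here is the textbook single-queue (M/M/1) result, T = S / (1 - rho), where S is the mean service time and rho the utilization. The 10 ms service time is an arbitrary number I picked for illustration; the shape of the curve is the point.

```python
service_time = 0.010  # mean service time per transaction: 10 ms (assumed)

# Mean response time grows without bound as utilization approaches 100%.
for utilization in (0.50, 0.80, 0.90, 0.95, 0.99):
    response = service_time / (1.0 - utilization)
    print(f"{utilization:.0%} utilized -> {response * 1000:.0f} ms mean response")
# 50% utilized -> 20 ms mean response
# 80% utilized -> 50 ms mean response
# 90% utilized -> 100 ms mean response
# 95% utilized -> 200 ms mean response
# 99% utilized -> 1000 ms mean response
```

Going from 50% to 95% utilization here multiplies response time by ten, which is exactly the "annoying degree" tradeoff: you pick the highest utilization whose response time your users will still tolerate.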
As I said before, if you can't drive all of your servers as close to 100% as possible, then you have a bottleneck in your system and are wasting something you paid for. As far as the CPU goes, that's why multi-threaded programs are a good idea: ideally you want at least one runnable thread (counting other programs' threads) per CPU core. You also don't want your other servers to be underutilized; that's why a mixture of applications tends to use the overall system more efficiently than a single program.
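One common way to apply the one-thread-per-core rule of thumb is to size a worker pool from the reported core count, as in this Python sketch (the workload is a made-up calculation, purely for illustration; note also that in CPython the GIL means threads only parallelize I/O-bound or native-code work, so for pure-Python CPU work the same sizing idea would be applied to a process pool instead).

```python
import os
from concurrent.futures import ThreadPoolExecutor

cores = os.cpu_count() or 1  # cpu_count() can return None on some platforms

# One task per core, so every core has at least one thread's worth of work.
with ThreadPoolExecutor(max_workers=cores) as pool:
    chunks = pool.map(lambda n: sum(i * i for i in range(n)), [10_000] * cores)
    total = sum(chunks)

print(f"{cores} cores -> {cores} workers, total = {total}")
```

The point is not this particular workload but the sizing: fewer threads than cores guarantees idle CPU (a wasted server), while many more runnable threads than cores just adds scheduling overhead without raising utilization.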
That's already more than most of you want to know, so I'll end with one final thought: every system has one bottleneck at every instant of every day.
Now, whether any given physical component can stand up to the load is a matter of engineering. For example, a CPU should be able to run at 100% forever. If it can't, then it wasn't designed properly. That might have been a marketing decision based on the target price point, or it might have been some kind of mistake.