Technology Stocks : Advanced Micro Devices - Moderated (AMD)


To: Joe NYC who wrote (74163)
Date: 3/10/2002 10:36:31 PM
From: Ali Chen
 
Jozef, re: "calling conventions"

Differences in calling conventions are irrelevant, since
in a normally designed application most of the time
(90-95%) is spent in loops inside subroutines/objects,
and the call overhead is a tiny fraction of the
run time.
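
To put a rough number on that argument, here is a minimal sketch of the usual upper-bound arithmetic: if some fraction of the run time goes to call overhead, removing that overhead entirely can speed the program up by at most 1 / (1 - fraction). The function name `max_speedup` is hypothetical, and the 5% figure in the usage note is only an illustrative assumption, not a measurement from this thread.

```c
#include <assert.h>

/*
 * Hypothetical helper: upper bound on the speedup obtained by eliminating
 * call overhead completely, given the fraction of run time it accounts for.
 * call_fraction is an assumed input, e.g. 0.05 for 5%.
 */
double max_speedup(double call_fraction)
{
    /* The fraction must be a valid share of run time below 100%. */
    assert(call_fraction >= 0.0 && call_fraction < 1.0);

    /* The remaining (1 - call_fraction) of the run time is untouched. */
    return 1.0 / (1.0 - call_fraction);
}
```

With an assumed call fraction of 0.05, `max_speedup(0.05)` comes out a little above 1.05, so even a free calling convention would buy only about five percent in that scenario.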

- Ali



To: Joe NYC who wrote (74163)
Date: 3/11/2002 1:51:55 AM
From: pgerassi
 
Dear Joe:

I write device drivers for Linux. There, the callee only has to restore the registers it actually uses. Stack or memory operations are more than an order of magnitude more costly than register-based operations. In C, the caller only has to clean up what it pushed. The return itself places the stack back to what it was prior to the call. That is why C passes stuff by value. These storage locations can be left alone and simply freed. They can be removed by changing the stack register contents, which is a register operation, not a memory one. And this extends easily to variable parameter lists. The caller knows how much it passed, but the callee does not and must figure it out or be passed that info (more stack usage).
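
The point about variable parameter lists can be sketched in standard C with `<stdarg.h>`. This is only an illustration: `sum_ints` is a hypothetical function, and passing an explicit `count` is one common way (not the only one) for the callee to learn how much the caller passed.

```c
#include <assert.h>
#include <stdarg.h>

/*
 * Hypothetical example: add up `count` int arguments.
 * The callee cannot discover the number of arguments on its own, so the
 * caller passes it explicitly as the first (fixed) parameter.
 */
int sum_ints(int count, ...)
{
    assert(count >= 0);

    va_list args;
    va_start(args, count);          /* start walking the variable arguments */

    int total = 0;
    for (int i = 0; i < count; i++)
        total += va_arg(args, int); /* fetch the next int argument */

    va_end(args);
    return total;
}
```

A call such as `sum_ints(3, 1, 2, 3)` returns 6. Since only the caller knows how many arguments it supplied, it is the natural party to release them afterwards.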

Since x86 uses preset registers for certain instructions, many more registers are touched by normal code than on a RISC-based system. Still, most programmers simply call a routine that saves/restores all of the registers, out of simplicity more than anything else. This does not mean that we must save/restore all the registers, and many of us embedded coders do not. Many assembly routines I write change as few registers as necessary to do the task, to save on save/restore.

Thus, C is more efficient than Pascal in terms of performance.

Pete



To: Joe NYC who wrote (74163)
Date: 3/11/2002 6:25:38 AM
From: Gopher Broke
 
Let's be clear. Calling conventions are irrelevant to the discussion of saving/restoring a thread's registers and state information.

All that is required for a simple function call is that the caller and callee agree on the conventions they use, which could range from saving everything to saving nothing. We have conventions like "C" and "Pascal" so that code generated by one compiler can be invoked by code generated by another compiler. No other reason.

Similarly, device drivers do not have to save and restore all registers on entry/exit, although many do just for convenience. They only need to restore the exact state that existed when the interrupt occurred, which means only the set of registers they modify. (This is still more onerous than a regular function call, however, because calling conventions generally allow subroutines to use some work registers without having to restore them.)

The most interesting area of register save/restore is the area of preemptive thread switching. In this case, when each interrupt handler completes, the scheduler is invoked to determine if a new thread should be executed. If so, the scheduler must save and restore the entire thread context. All registers and flags must be saved/restored. In x86-32 the processor provides assistance in the form of a task state segment (TSS). Calling through a TSS will automatically store the current state in the running TSS and restore the state of the target TSS.

And finally, to return to the topic, in x86-64 it seems that task switching through the TSS is not supported in long mode. The scheduler must therefore manually save/restore the registers to/from the blocked thread's stack. I guess AMD determined that there was little benefit in having hardware assist (or perhaps that no OSs bother to use the TSS mechanism)?
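
As a rough picture of what "manually save/restore" means, here is a simplified simulation in plain C. It does not touch real CPU registers (that needs assembly inside a kernel); the `cpu_context` and `thread` structs and the `switch_context` function are hypothetical names for illustration. It also keeps the saved state in a per-thread struct rather than on the thread's stack, and it leaves out floating-point/SSE state.

```c
#include <assert.h>
#include <stdint.h>

enum { NUM_GPRS = 16 };             /* x86-64 has 16 general-purpose registers */

/* Simulated register file: stands in for the real CPU state. */
struct cpu_context {
    uint64_t gpr[NUM_GPRS];         /* general-purpose registers */
    uint64_t rip;                   /* instruction pointer */
    uint64_t rflags;                /* flags register */
};

/* Hypothetical thread record holding the context of a blocked thread. */
struct thread {
    struct cpu_context saved;
};

/*
 * Software context switch: store the running thread's full state, then
 * load the state of the thread chosen to run next.
 */
void switch_context(struct cpu_context *cpu,
                    struct thread *prev,
                    struct thread *next)
{
    assert(cpu && prev && next);

    prev->saved = *cpu;             /* save everything for the outgoing thread */
    *cpu = next->saved;             /* restore everything for the incoming thread */
}
```

In this sketch the scheduler would call `switch_context(&cpu, current, chosen)` after an interrupt handler completes; every register and the flags are copied both ways, which is the work the TSS mechanism did in hardware on x86-32.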