% File: callconv_x64.tex

%//////////////////////////////////////////////////////////////////////////////
%
% Copyright (c) 2007,2009 Daniel Adler <dadler@uni-goettingen.de>, 
%                         Tassilo Philipp <tphilipp@potion-studios.com>
%
% Permission to use, copy, modify, and distribute this software for any
% purpose with or without fee is hereby granted, provided that the above
% copyright notice and this permission notice appear in all copies.
%
% THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
% WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
% MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
% ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
% WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
% ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
% OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
%
%//////////////////////////////////////////////////////////////////////////////

% ==================================================
% x64
% ==================================================
\subsection{x64 Calling Convention}


\paragraph{Overview}

The x64 (64-bit) architecture, designed by AMD, is based on Intel's x86 (32-bit)
architecture and supports it natively. It is also referred to as x86-64 or
AMD64; Intel's implementation is known as EM64T or Intel64.\\
On this processor, a word is defined to be 16 bits in size, a dword 32 bits
and a qword 64 bits. This is for historical reasons (the terminology
did not change with the introduction of 32- and 64-bit processors).\\
The x64 calling convention for MS Windows \cite{x64Win} differs from the
System V x64 calling convention \cite{x64SysV} used by Linux/*BSD/...
This is not the only difference between these operating systems: the
64-bit programming model used by 64-bit Windows is LLP64, meaning that the C
types int and long remain 32 bits in size, whereas long long is 64 bits.
Linux/*BSD/... use LP64, where long is 64 bits as well.\\
\\
Compared to the x86 architecture, the 64-bit versions of the registers are
called rax, rbx, etc. Furthermore, there are eight new general purpose
registers, r8-r15.
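
As a quick reference, the two data models mentioned above differ only in the
size of long. The following table lists the sizes (in bits) one would usually
expect from common compilers on each platform; it is meant as an illustration
and is not taken from the cited specifications:

\begin{tabular}{l c c}
\hline
C type     & LLP64 (MS Windows) & LP64 (Linux/*BSD/...)\\
\hline
int        & 32                 & 32\\
long       & 32                 & 64\\
long long  & 64                 & 64\\
pointer    & 64                 & 64\\
\hline
\end{tabular}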



\paragraph{\product{dyncall} support}

\product{dyncall} supports both the MS Windows and the System V calling conventions.\\
\\



\subsubsection{MS Windows}

\paragraph{Registers and register usage}

\begin{table}[h]
\begin{tabular}{3 B}
\hline
Name                & Brief description\\
\hline
{\bf rax}           & scratch, return value\\
{\bf rbx}           & permanent\\
{\bf rcx}           & scratch, parameter 0 if integer or pointer\\
{\bf rdx}           & scratch, parameter 1 if integer or pointer\\
{\bf rdi}           & permanent\\
{\bf rsi}           & permanent\\
{\bf rbp}           & permanent, may be used as frame pointer\\
{\bf rsp}           & stack pointer\\
{\bf r8-r9}         & scratch, parameter 2 and 3 if integer or pointer\\
{\bf r10-r11}       & scratch, must be preserved by the caller if needed (used by syscall/sysret)\\
{\bf r12-r15}       & permanent\\
{\bf xmm0}          & scratch, floating point parameter 0, floating point return value\\
{\bf xmm1-xmm3}     & scratch, floating point parameters 1-3\\
{\bf xmm4-xmm5}     & scratch, must be preserved by the caller if needed\\
{\bf xmm6-xmm15}    & permanent\\
\hline
\end{tabular}
\caption{Register usage on x64 MS Windows platform}
\end{table}

\paragraph{Parameter passing}

\begin{itemize}
\item stack parameter order: right-to-left
\item caller cleans up the stack, not the callee (like cdecl)
\item the first 4 integer/pointer parameters are passed via rcx, rdx, r8, r9 (from left to right), the others are pushed onto the stack (a
spill area is reserved on the stack for the first 4)
\item float and double parameters are passed via xmm0l-xmm3l
\item the first 4 parameters are passed via the register matching the parameter's position and type - with mixed float and int parameters,
some registers are left unused (e.g. the first parameter ends up in rcx or xmm0, the second in rdx or xmm1, etc.)
\item parameters in registers are right justified
\item parameters \textless\ 64 bits are not zero extended - zero the upper bits, which may contain garbage, if needed (but they are always
passed as a qword)
\item parameters \textgreater\ 64 bits are passed by reference
\item if the callee takes the address of a parameter, the first 4 parameters must be dumped (to the reserved space on the stack) - for
floating point parameters, the value must be stored in the integer AND the floating point register
\item stack is always 16-byte aligned - since the return address is 64 bits in size, stacks with an odd number of parameters are
already aligned
\item ellipsis calls take floating point values in int and float registers (single precision floats are promoted to double precision
as defined for ellipsis calls)
\item if the size of the parameters is \textgreater\ 1 page of memory (usually between 4k and 64k), chkstk must be called
\end{itemize}
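
As an illustration of the positional register assignment described above,
consider the following prototype (the function and its parameter names are
hypothetical and only serve this example):

\begin{verbatim}
void f(int a, double b, void* c, float d, long long e);
\end{verbatim}

Assuming a non-variadic call, the rules above suggest the following argument
locations on entry to f, i.e. right after the call instruction has pushed the
return address:

\begin{tabular}{l l l}
\hline
Parameter & Position & Location\\
\hline
a         & 0        & rcx (upper 32 bits undefined)\\
b         & 1        & xmm1 (rdx and xmm0 are left unused)\\
c         & 2        & r8\\
d         & 3        & xmm3 (r9 is left unused)\\
e         & 4        & stack, at rsp+40\\
\hline
\end{tabular}

The offset of e follows from the layout: the return address sits at rsp, the
32-byte spill area for the four register parameters occupies rsp+8 to rsp+39,
and the first stack parameter comes directly after it.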


\paragraph{Return values}

\begin{itemize}
\item return values of pointer or integral type (\textless=\ 64 bits) are returned via the rax register
\item floating point types are returned via the xmm0 register
\item for types \textgreater\ 64 bits, a hidden first parameter holding the address of the return value is passed
\end{itemize}
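
The following sketch applies these rules to three hypothetical functions (all
names are made up for this example):

\begin{verbatim}
struct S { double x; double y; };   /* 128 bits */

long long f1(void);
double    f2(void);
struct S  f3(int n);
\end{verbatim}

\begin{tabular}{l l}
\hline
Function & Return value location\\
\hline
f1       & rax\\
f2       & xmm0\\
f3       & memory provided by the caller, its address is passed in rcx\\
\hline
\end{tabular}

Note that for f3 the hidden parameter takes the place of parameter 0, so n is
expected in rdx instead of rcx. According to Microsoft's description of the
convention, the callee also hands the same address back in rax.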


\paragraph{Stack layout}

Stack frame is always 16-byte aligned. Stack directly after function prolog:\\

\begin{figure}[h]
\begin{tabular}{5|3|1 1}
\hhline{~-~~}
                                  & \vdots                     &                                &                              \\
\hhline{~=~~}
local data                        &                            &                                & \mrrbrace{9}{caller's frame} \\
\hhline{~-~~}
\mrlbrace{7}{parameter area}      & \ldots                     & \mrrbrace{3}{stack parameters} &                              \\
                                  & \ldots                     &                                &                              \\
                                  & \ldots                     &                                &                              \\
                                  & r9 or xmm3                 & \mrrbrace{4}{spill area}       &                              \\
                                  & r8 or xmm2                 &                                &                              \\
                                  & rdx or xmm1                &                                &                              \\
                                  & rcx or xmm0                &                                &                              \\
\hhline{~-~~}
                                  & return address             &                                &                              \\
\hhline{~=~~}
local data                        &                            &                                & \mrrbrace{3}{current frame}  \\
\hhline{~-~~}
parameter area                    &                            &                                &                              \\
\hhline{~-~~}
                                  & \vdots                     &                                &                              \\
\hhline{~-~~}
\end{tabular}
\caption{Stack layout on x64 Microsoft platform}
\end{figure}



\newpage

\subsubsection{System V (Linux / *BSD / MacOS X)}

\paragraph{Registers and register usage}

\begin{table}[h]
\begin{tabular}{3 B}
\hline
Name                & Brief description\\
\hline
{\bf rax}           & scratch, return value\\
{\bf rbx}           & permanent\\
{\bf rcx}           & scratch, parameter 3 if integer or pointer\\
{\bf rdx}           & scratch, parameter 2 if integer or pointer, return value\\
{\bf rdi}           & scratch, parameter 0 if integer or pointer\\
{\bf rsi}           & scratch, parameter 1 if integer or pointer\\
{\bf rbp}           & permanent, may be used as frame pointer\\
{\bf rsp}           & stack pointer\\
{\bf r8-r9}         & scratch, parameter 4 and 5 if integer or pointer\\
{\bf r10-r11}       & scratch\\
{\bf r12-r15}       & permanent\\
{\bf xmm0}          & scratch, floating point parameter 0, floating point return value\\
{\bf xmm1-xmm7}     & scratch, floating point parameters 1-7\\
{\bf xmm8-xmm15}    & scratch\\
{\bf st0-st1}       & scratch, 16 byte floating point return value\\
{\bf st2-st7}       & scratch\\
\hline
\end{tabular}
\caption{Register usage on x64 System V (Linux/*BSD)}
\end{table}

\paragraph{Parameter passing}

\begin{itemize}
\item stack parameter order: right-to-left
\item caller cleans up the stack
\item the first 6 integer/pointer parameters are passed via rdi, rsi, rdx, rcx, r8, r9 (from left to right)
\item the first 8 floating point parameters \textless=\ 64 bits are passed via xmm0l-xmm7l
\item parameters in registers are right justified
\item parameters that are not passed via registers are pushed onto the stack
\item parameters \textless\ 64 bits are not zero extended - zero the upper bits, which may contain garbage, if needed (but they are always
passed as a qword)
\item integer/pointer parameters \textgreater\ 64 bits are passed via 2 registers
\item for calls to functions taking a variable number of arguments (ellipsis calls), the number of used xmm registers is passed silently in al (the passed number need not be
exact, but it must be an upper bound on the number of used xmm registers)
\item stack is always 16-byte aligned - since the return address is 64 bits in size, stacks with an odd number of parameters are
already aligned
\end{itemize}
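
In contrast to the MS Windows convention, integer and floating point registers
are assigned independently of each other. As an illustration, consider the
following prototype (the function and its parameter names are hypothetical and
only serve this example):

\begin{verbatim}
void f(int a, double b, void* c, float d, long long e);
\end{verbatim}

Under the rules above, one would expect the following assignment, with nothing
being passed on the stack:

\begin{tabular}{l l}
\hline
Parameter & Location\\
\hline
a         & rdi (upper 32 bits undefined)\\
b         & xmm0\\
c         & rsi\\
d         & xmm1\\
e         & rdx\\
\hline
\end{tabular}

For an ellipsis call such as the one below, the format string's address would
be passed in rdi, the double in xmm0, and al would be set to 1 (any larger
value up to 8 would also be a valid upper bound):

\begin{verbatim}
printf("%f\n", 1.5);
\end{verbatim}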


\paragraph{Return values}

\begin{itemize}
\item return values of pointer or integral type (\textless=\ 64 bits) are returned via the rax register
\item floating point types are returned via the xmm0 register
\item for types \textgreater\ 64 bits, a hidden first parameter holding the address of the return value is passed - the passed-in address
will be returned in rax
\item floating point values \textgreater\ 64 bits are returned via st0 and st1
\end{itemize}
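
The following sketch applies these rules to four hypothetical functions (all
names are made up for this example; a compiler following the System V ABI,
where long double is the 80-bit x87 format, is assumed):

\begin{verbatim}
struct Big { char data[64]; };   /* 512 bits */

long long   f1(void);
double      f2(void);
long double f3(void);
struct Big  f4(int n);
\end{verbatim}

\begin{tabular}{l l}
\hline
Function & Return value location\\
\hline
f1       & rax\\
f2       & xmm0\\
f3       & st0\\
f4       & memory provided by the caller, its address is passed in rdi and returned in rax\\
\hline
\end{tabular}

Note that for f4 the hidden parameter takes the first integer register, so n
is expected in rsi instead of rdi.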


\paragraph{Stack layout}

Stack frame is always 16-byte aligned. Note that there is no spill area.
Stack directly after function prolog:\\

\begin{figure}[h]
\begin{tabular}{5|3|1 1}
\hhline{~-~~}
                                  & \vdots                     &                                &                              \\
\hhline{~=~~}
local data                        &                            &                                & \mrrbrace{5}{caller's frame} \\
\hhline{~-~~}
\mrlbrace{3}{parameter area}      & \ldots                     & \mrrbrace{3}{stack parameters} &                              \\
                                  & \ldots                     &                                &                              \\
                                  & \ldots                     &                                &                              \\
\hhline{~-~~}
                                  & return address             &                                &                              \\
\hhline{~=~~}
local data                        &                            &                                & \mrrbrace{3}{current frame}  \\
\hhline{~-~~}
parameter area                    &                            &                                &                              \\
\hhline{~-~~}
                                  & \vdots                     &                                &                              \\
\hhline{~-~~}
\end{tabular}
\caption{Stack layout on x64 System V (Linux/*BSD)}
\end{figure}