| 12
 3
 4
 5
 6
 7
 8
 9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 
 | //===---------------------------------------------------------------------===//
// Random ideas for the X86 backend: FP stack related stuff
//===---------------------------------------------------------------------===//
//===---------------------------------------------------------------------===//
Some targets (e.g. athlons) prefer freep to fstp ST(0):
http://gcc.gnu.org/ml/gcc-patches/2004-04/msg00659.html
//===---------------------------------------------------------------------===//
This should use fiadd on chips where it is profitable:
double foo(double P, int *I) { return P+*I; }
We have fiadd patterns now but the followings have the same cost and
complexity. We need a way to specify the later is more profitable.
def FpADD32m  : FpI<(ops RFP:$dst, RFP:$src1, f32mem:$src2), OneArgFPRW,
                    [(set RFP:$dst, (fadd RFP:$src1,
                                     (extloadf64f32 addr:$src2)))]>;
                // ST(0) = ST(0) + [mem32]
def FpIADD32m : FpI<(ops RFP:$dst, RFP:$src1, i32mem:$src2), OneArgFPRW,
                    [(set RFP:$dst, (fadd RFP:$src1,
                                     (X86fild addr:$src2, i32)))]>;
                // ST(0) = ST(0) + [mem32int]
//===---------------------------------------------------------------------===//
The FP stackifier should handle simple permutates to reduce number of shuffle
instructions, e.g. turning:
fld P	->		fld Q
fld Q			fld P
fxch
or:
fxch	->		fucomi
fucomi			jl X
jg X
Ideas:
http://gcc.gnu.org/ml/gcc-patches/2004-11/msg02410.html
//===---------------------------------------------------------------------===//
Add a target specific hook to DAG combiner to handle SINT_TO_FP and
FP_TO_SINT when the source operand is already in memory.
//===---------------------------------------------------------------------===//
Open code rint,floor,ceil,trunc:
http://gcc.gnu.org/ml/gcc-patches/2004-08/msg02006.html
http://gcc.gnu.org/ml/gcc-patches/2004-08/msg02011.html
Opencode the sincos[f] libcall.
//===---------------------------------------------------------------------===//
None of the FPStack instructions are handled in
X86RegisterInfo::foldMemoryOperand, which prevents the spiller from
folding spill code into the instructions.
//===---------------------------------------------------------------------===//
Currently the x86 codegen isn't very good at mixing SSE and FPStack
code:
unsigned int foo(double x) { return x; }
foo:
	subl $20, %esp
	movsd 24(%esp), %xmm0
	movsd %xmm0, 8(%esp)
	fldl 8(%esp)
	fisttpll (%esp)
	movl (%esp), %eax
	addl $20, %esp
	ret
This just requires being smarter when custom expanding fptoui.
//===---------------------------------------------------------------------===//
 |