1 |
6 |
jlechner |
; Application independent decoder support.
|
2 |
|
|
; Copyright (C) 2000, 2004, 2009 Red Hat, Inc.
|
3 |
|
|
; This file is part of CGEN.
|
4 |
|
|
;
|
5 |
|
|
; This file provides utilities for building instruction set decoders.
|
6 |
|
|
; At present its rather limited, and is geared towards the simulator
|
7 |
|
|
; where the goal is hyper-efficiency [not that there isn't room for much
|
8 |
|
|
; improvement, but rather that that's what the current focus is].
|
9 |
|
|
;
|
10 |
|
|
; The CPU description file provides the first pass's bit mask with the
|
11 |
|
|
; `decode-assist' spec. This gives the decoder a head start on how to
|
12 |
|
|
; efficiently decode the instruction set. The rest of the decoder is
|
13 |
|
|
; determined algorithmically.
|
14 |
|
|
; ??? Need to say more here.
|
15 |
|
|
;
|
16 |
|
|
; The main entry point is decode-build-table.
|
17 |
|
|
;
|
18 |
|
|
; Main procedure call tree:
|
19 |
|
|
; decode-build-table
|
20 |
|
|
; -build-slots
|
21 |
|
|
; -build-decode-table-guts
|
22 |
|
|
; -build-decode-table-entry
|
23 |
|
|
; -build-slots
|
24 |
|
|
; -build-decode-table-guts
|
25 |
|
|
;
|
26 |
|
|
; -build-slots/-build-decode-table-guts are recursively called to construct a
|
27 |
|
|
; tree of "table-guts" elements, and then the application recurses on the
|
28 |
|
|
; result. For example see sim-decode.scm.
|
29 |
|
|
;
|
30 |
|
|
; FIXME: Don't create more than 3 shifts (i.e. no more than 3 groups).
|
31 |
|
|
; FIXME: Exits when insns are unambiguously determined, even if there are more
|
32 |
|
|
; opcode bits to examine.
|
33 |
|
|
|
34 |
|
|
; The set of instruction is internally recorded as a tree of two data
|
35 |
|
|
; structures: "table-guts" and "table-entry".
|
36 |
|
|
; [The choice of "table-guts" is historical, a better name will come to mind
|
37 |
|
|
; eventually.]
|
38 |
|
|
; Decoded tables data structure, termed "table guts".
|
39 |
|
|
|
40 |
|
|
; bitnums: list of bits that have been used thus far to decode the insn
|
41 |
|
|
; startbit: bit offset in instruction of value in C local variable `insn'
|
42 |
|
|
; bitsize: size of value in C local variable `insn', the number
|
43 |
|
|
; of bits of the instruction read thus far
|
44 |
|
|
; entries: list of insns that match the decoding thus far,
|
45 |
|
|
; each entry in the list is a `dtable-entry' record
|
46 |
|
|
; Accessors.
|
47 |
|
|
; A decoded subtable.
|
48 |
|
|
|
49 |
|
|
; key: name to distinguish this subtable from others, used for lookup
|
50 |
|
|
; table: a table-guts element
|
51 |
|
|
; name: name of C variable containing the table
|
52 |
|
|
|
53 |
|
|
; The implementation uses a list so the lookup can use assv.
|
54 |
|
|
; Accessors.
|
55 |
|
|
; List of decode subtables.
|
56 |
|
|
; Add SUBTABLE-GUTS to the subtables list if not already present.
|
57 |
|
|
; Result is the subtable entry already present, or new entry.
|
58 |
|
|
|
59 |
|
|
" "" "" "" "" ""bad dtable entry type:"; An instruction and predicate for final matching.
|
60 |
|
|
; Accessors.
|
61 |
|
|
; Return a pseudo-cost of processing exprentry X.
|
62 |
|
|
; Sort an exprtable, optimum choices first.
|
63 |
|
|
; Basically an optimum choice is a cheaper choice.
|
64 |
|
|
; Return the name of the expr table for INSN-EXPRS,
|
65 |
|
|
; which is a list of exprtable-entry elements.
|
66 |
|
|
|
67 |
|
|
; Typically the expressions are ifield-assertion specs.
|
68 |
|
|
; INSN-EXPRS is a sorted list of exprtable-entry elements.
|
69 |
|
|
; The list is considered sorted in the sense that the first insn to satisfy
|
70 |
|
|
|
71 |
|
|
; Accessors.
|
72 |
|
|
; Decoded table entry data structure.
|
73 |
|
|
; A simple data structure of 3 elements:
|
74 |
|
|
; index: index in the parent table
|
75 |
|
|
|
76 |
|
|
; value: the insn or subtable or exprtable
|
77 |
|
|
; Accessors.
|
78 |
|
|
|
79 |
|
|
; MASKS is a list of opcode masks.
|
80 |
|
|
|
81 |
|
|
; BITNUM is the number of the bit to test. It's value depends on LSB0?.
|
82 |
|
|
; It can be no larger than the smallest element in MASKS.
|
83 |
|
|
; E.g. If MASK-LENS consists of 16 and 32 and LSB0? is #f, BITNUM must
|
84 |
|
|
|
85 |
|
|
; FIXME: This isn't quite right. What if LSB0? = #t? Need decode-bitsize.
|
86 |
|
|
; LSB0? is non-#f if bit number 0 is the least significant bit.
|
87 |
|
|
;
|
88 |
|
|
; FIXME: This is just a first cut, but the governing intent is to not require
|
89 |
|
|
; targets to specify decode tables, hints, or algorithms.
|
90 |
|
|
; Certainly as it becomes useful they can supply such information.
|
91 |
|
|
; The point is to avoid having to as much as possible.
|
92 |
|
|
;
|
93 |
|
|
; FIXME: Bit numbers shouldn't be considered in isolation.
|
94 |
|
|
; It would be better to compute use counts of all of them and then see
|
95 |
|
|
; if there's a cluster of high use counts.
|
96 |
|
|
; If half or more insns use the bit, it's a good one.
|
97 |
|
|
; FIXME: An empirical guess at best.
|
98 |
|
|
; Compute population counts for each bit. Return it as a vector indexed by bit
|
99 |
|
|
; number. Rather than computing raw popularity, attempt to compute
|
100 |
|
|
; "disinguishing value" or inverse-entropy for each bit. The idea is that the
|
101 |
|
|
; larger the number for any particular bit slot, the more instructions it can
|
102 |
|
|
; be used to distinguish. Raw mask popularity is not enough -- popular masks
|
103 |
|
|
; may include useless "reserved" fields whose values don't change, and thus are
|
104 |
|
|
; useless in distinguishing.
|
105 |
|
|
;
|
106 |
|
|
; NOTE: mask-lens are not necessarily all the same value.
|
107 |
|
|
; E.g. for the m32r it can consist of both 16 and 32.
|
108 |
|
|
; But all masks must exist in the window specified by STARTBIT,DECODE-BITSIZE,
|
109 |
|
|
; and all bits in the result must live in that window.
|
110 |
|
|
|
111 |
|
|
; Compute the 1- and 0-population vectors
|
112 |
|
|
|
113 |
|
|
; Compute an aggregate "distinguishing value" for each bit.
|
114 |
|
|
"/"" "; The most useful bits for decoding are those with counts in both
|
115 |
|
|
; p0 and p1. These are the bits which distinguish one insn from
|
116 |
|
|
|
117 |
|
|
;
|
118 |
|
|
|
119 |
|
|
; or p1. These bits represent specializations of other insns.
|
120 |
|
|
; Assign these bits a value between 0 and (num-insns - 1). Note that
|
121 |
|
|
; p0 + p1 is guaranteed to be <= num-insns. The value 0 is assigned
|
122 |
|
|
|
123 |
|
|
; which are always 1 or always 0 in the ISA and are useless for
|
124 |
|
|
|
125 |
|
|
;
|
126 |
|
|
; Bits with no count in either p0 or p1 are useless for decoding
|
127 |
|
|
; and should never be considered. Assigning these bits a value of
|
128 |
|
|
; 0 ensures this.
|
129 |
|
|
; Return a list (0 ... LIMIT-1).
|
130 |
|
|
; Return a list (BASE ... BASE+SIZE-1).
|
131 |
|
|
|
132 |
|
|
; to VALUE.
|
133 |
|
|
; Return a list of indices whose counts in the given vector exceed the given
|
134 |
|
|
|
135 |
|
|
; Sort them in decreasing order of popularity.
|
136 |
|
|
; Return the top few most popular indices in the population vector,
|
137 |
|
|
; ignoring any that are already used (marked by #f). Don't exceed
|
138 |
|
|
; `size' unless the clustering is just too good to pass up.
|
139 |
|
|
"-population-top-few"" desired="" picks=("") pop=("")"" threshold="" new-picks=("")\n"; No point picking bits with population count of zero. This leads to
|
140 |
|
|
; the generation of layers of subtables which resolve nothing. Generating
|
141 |
|
|
; these tables can slow the build by several orders of magnitude.
|
142 |
|
|
|
143 |
|
|
"-population-top-few: No bits left to pick from!\n"; Way too many matches?
|
144 |
|
|
; prefer old-picks
|
145 |
|
|
|
146 |
|
|
; Not enough? Lower the threshold a bit and try to add some more.
|
147 |
|
|
; Notice magic clustering decay parameter
|
148 |
|
|
; vvvv
|
149 |
|
|
; Given list of insns, return list of bit numbers of constant bits in opcode
|
150 |
|
|
; that they all share (or mostly share), up to MAX elements.
|
151 |
|
|
; ALREADY-USED is a list of bitnums we can't use.
|
152 |
|
|
; STARTBIT is the bit offset of the instruction value that C variable `insn'
|
153 |
|
|
|
154 |
|
|
; DECODE-BITSIZE is the number of bits of the insn that `insn' holds.
|
155 |
|
|
; LSB0? is non-#f if bit number 0 is the least significant bit.
|
156 |
|
|
;
|
157 |
|
|
; Nil is returned if there are none, meaning that there is an ambiguity in
|
158 |
|
|
; the specification up to the current word as defined by startbit,
|
159 |
|
|
|
160 |
|
|
;
|
161 |
|
|
; We assume INSN-LIST matches all opcode bits before STARTBIT (if any).
|
162 |
|
|
; FIXME: Revisit, as a more optimal decoder is sometimes achieved by doing
|
163 |
|
|
|
164 |
|
|
; back to earlier ones.
|
165 |
|
|
|
166 |
|
|
; All insns are assumed to start at the same address so we handle insns of
|
167 |
|
|
; varying lengths - we only analyze the common bits in all of them.
|
168 |
|
|
|
169 |
|
|
; Note that if we get called again to compute further opcode bits, we
|
170 |
|
|
; start looking at STARTBIT again (rather than keeping track of how far in
|
171 |
|
|
; the insn word we've progressed). We could do this as an optimization, but
|
172 |
|
|
; we also have to handle the case where the initial set of decode bits misses
|
173 |
|
|
; some and thus we have to go back and look at them. It may also turn out
|
174 |
|
|
|
175 |
|
|
; information to the decoding process (see -usable-decode-bit?). As the
|
176 |
|
|
; possible insn list gets wittled down, the bit will become significant. Thus
|
177 |
|
|
; the optimization is left for later.
|
178 |
|
|
; Also, see preceding FIXME: We can't proceed past startbit + decode-bitsize
|
179 |
|
|
|
180 |
|
|
;; (undecoded (if lsb0?
|
181 |
|
|
;; (-range2 startbit (+ startbit decode-bitsize))
|
182 |
|
|
;; (-range2 (- startbit decode-bitsize) startbit)))
|
183 |
|
|
; (append already-used undecoded))
|
184 |
|
|
|
185 |
|
|
; than the base insn size.
|
186 |
|
|
; FIXME: for now (gets sparc port going)
|
187 |
|
|
; Return list of decode table entry numbers for INSN's opcode bits BITNUMS.
|
188 |
|
|
; This is the indices into the decode table that match the instruction.
|
189 |
|
|
; LSB0? is non-#f if bit number 0 is the least significant bit.
|
190 |
|
|
;
|
191 |
|
|
; Example: If BITNUMS is (0 1 2 3 4 5), and the constant (i.e. opcode) part of
|
192 |
|
|
; the those bits of INSN is #b1100xx (where 'x' indicates a non-constant
|
193 |
|
|
; part), then the result is (#b110000 #b110001 #b110010 #b110011).
|
194 |
|
|
;(display (list val insn-len decode-len bl)) (newline)
|
195 |
|
|
; Oh My God. This isn't tail recursive.
|
196 |
|
|
"insn ="" insn-value="" insn-base-mask="" insn-len="" decode-len="" opcode="" opcode-mask="" indices=""\n"; Subroutine of -build-slots.
|
197 |
|
|
; Fill slot in INSN-VEC that INSN goes into.
|
198 |
|
|
; BITNUMS is the list of opcode bits.
|
199 |
|
|
; LSB0? is non-#f if bit number 0 is the least significant bit.
|
200 |
|
|
;
|
201 |
|
|
; Example: If BITNUMS is (0 1 2 3 4 5) and the constant (i.e. opcode) part of
|
202 |
|
|
; the first six bits of INSN is #b1100xx (where 'x' indicates a non-constant
|
203 |
|
|
|
204 |
|
|
; Each "slot" is a list of matching instructions.
|
205 |
|
|
;(display (string-append "fill-slot!: " (obj:str-name insn) " ")) (display bitnums) (newline)
|
206 |
|
|
;(display (list "Filling slot(s)" slot-nums "...")) (newline)
|
207 |
|
|
; Given a list of constant bitnums (ones that are predominantly, though perhaps
|
208 |
|
|
; not always, in the opcode), record each insn in INSN-LIST in the proper slot.
|
209 |
|
|
; LSB0? is non-#f if bit number 0 is the least significant bit.
|
210 |
|
|
; The result is a vector of insn lists. Each slot is a list of insns
|
211 |
|
|
; that go in that slot.
|
212 |
|
|
; Loop over each element, filling RESULT.
|
213 |
|
|
; Compute the name of a decode table, prefixed with PREFIX.
|
214 |
|
|
|
215 |
|
|
; in reverse order of traversal (since they're built with cons).
|
216 |
|
|
; INDEX-LIST may be empty.
|
217 |
|
|
"table""_"; CDR of each element is the table index.
|
218 |
|
|
; Generate one decode table entry for INSN-VEC at INDEX.
|
219 |
|
|
; INSN-VEC is a vector of slots where each slot is a list of instructions that
|
220 |
|
|
; map to that slot (opcode value). If a slot is nil, no insn has that opcode
|
221 |
|
|
; value so the decoder marks it as being invalid.
|
222 |
|
|
; STARTBIT is the bit offset of the instruction value that C variable `insn'
|
223 |
|
|
; holds (note that this is independent of LSB0?).
|
224 |
|
|
; DECODE-BITSIZE is the number of bits of the insn that `insn' holds.
|
225 |
|
|
; INDEX-LIST is a list of pairs: list of bitnums, table entry number.
|
226 |
|
|
; LSB0? is non-#f if bit number 0 is the least significant bit.
|
227 |
|
|
; INVALID-INSN is an <insn> object to use for invalid insns.
|
228 |
|
|
|
229 |
|
|
; ??? For debugging.
|
230 |
|
|
"Processing decode entry "" in ""decode_"", ""invalid""subtable"" ...\n"; If no insns map to this value, mark it as invalid.
|
231 |
|
|
; If only one insn maps to this value, that's it for this insn.
|
232 |
|
|
; FIXME: Incomplete: need to check further opcode bits.
|
233 |
|
|
; Otherwise more than one insn maps to this value and we need to look at
|
234 |
|
|
; further opcode bits.
|
235 |
|
|
"Building subtable at index "", decode-bitsize = "", indices used thus far:"" ""\n"; If bitnums is nil, either there is an ambiguity or we need to read
|
236 |
|
|
; more of the instruction in order to distinguish insns in SLOT.
|
237 |
|
|
; We might be able to resolve the ambiguity by reading more bits.
|
238 |
|
|
; We know from the < test that there are, indeed, more bits to
|
239 |
|
|
; be read.
|
240 |
|
|
; FIXME: It's technically possible that the next
|
241 |
|
|
; startbit+decode-bitsize chunk has no usable bits and we have to
|
242 |
|
|
; iterate, but rather unlikely.
|
243 |
|
|
; The calculation of the new startbit, decode-bitsize will
|
244 |
|
|
; undoubtedly need refinement.
|
245 |
|
|
;nil ; FIXME: what to put here?
|
246 |
|
|
; If bitnums is still nil there is an ambiguity.
|
247 |
|
|
; Try filtering out insns which are more general cases of
|
248 |
|
|
; other insns in the slot. The filtered insns will appear
|
249 |
|
|
; in other slots as appropriate.
|
250 |
|
|
; Only 1 insn left in the slot, so take it.
|
251 |
|
|
; There is still more than one insn in 'slot',
|
252 |
|
|
; so there is still an ambiguity.
|
253 |
|
|
; If all insns are marked as DECODE-SPLIT, don't warn.
|
254 |
|
|
"WARNING: Decoder ambiguity detected: "; drop leading comma
|
255 |
|
|
", ""\n"; Things aren't entirely hopeless. We've warned about
|
256 |
|
|
; the ambiguity. Now, if there are any identical insns,
|
257 |
|
|
; filter them out. If only one remains, then use it.
|
258 |
|
|
; Only 1 insn left in the slot, so take it.
|
259 |
|
|
; Otherwise, see if any ifield-assertion
|
260 |
|
|
; specs are present.
|
261 |
|
|
; FIXME: For now we assume that if they all have an
|
262 |
|
|
; ifield-assertion spec, then there is no ambiguity (it's left
|
263 |
|
|
; to the programmer to get it right). This can be made more
|
264 |
|
|
; clever later.
|
265 |
|
|
; FIXME: May need to back up startbit if we've tried to read
|
266 |
|
|
; more of the instruction. We currently require that
|
267 |
|
|
; all bits get used before advancing startbit, so this
|
268 |
|
|
; shouldn't be necessary. Verify.
|
269 |
|
|
; Save arguments for debugging purposes.
|
270 |
|
|
"Unable to resolve ambiguity (maybe need some ifield-assertion specs?)"; FIXME: Punt on even simple cleverness for now.
|
271 |
|
|
; There is no ambiguity so generate the subtable.
|
272 |
|
|
|
273 |
|
|
; may be appending to -decode-subtables recursively.
|
274 |
|
|
|
275 |
|
|
; as a list of 3 elements: bitnums, decode-bitsize, and list of entries.
|
276 |
|
|
; Bitnums is recorded with the guts so that tables whose contents are
|
277 |
|
|
; identical but are accessed by different bitnums are treated as separate in
|
278 |
|
|
; -decode-subtables. Not sure this will ever happen, but play it safe.
|
279 |
|
|
;
|
280 |
|
|
; BITNUMS is the list of bit numbers used to build the slot table.
|
281 |
|
|
; STARTBIT is the bit offset of the instruction value that C variable `insn'
|
282 |
|
|
|
283 |
|
|
; For example, it is initially zero. If DECODE-BITSIZE is 16 and after
|
284 |
|
|
|
285 |
|
|
; needed, another piece will be fetched and STARTBIT will then be 16.
|
286 |
|
|
; DECODE-BITSIZE is the number of bits of the insn that `insn' holds.
|
287 |
|
|
; INDEX-LIST is a list of pairs: list of bitnums, table entry number.
|
288 |
|
|
; Decode tables consist of entries of two types: actual insns and
|
289 |
|
|
; pointers to other tables.
|
290 |
|
|
; LSB0? is non-#f if bit number 0 is the least significant bit.
|
291 |
|
|
; INVALID-INSN is an <insn> object representing invalid insns.
|
292 |
|
|
|
293 |
|
|
; Return a table that efficiently decodes INSN-LIST.
|
294 |
|
|
; BITNUMS is the set of bits to initially key off of.
|
295 |
|
|
|
296 |
|
|
; LSB0? is non-#f if bit number 0 is the least significant bit.
|
297 |
|
|
; INVALID-INSN is an <insn> object representing the `invalid' insn (for
|
298 |
|
|
; instructions values that don't decode to any entry in INSN-LIST).
|
299 |
|
|
; Initialize the list of subtables computed.
|
300 |
|
|
; ??? Another way to handle simple forms of ifield-assertions (like those
|
301 |
|
|
; created by insn specialization) is to record a copy of the insn for each
|
302 |
|
|
; possible value of the ifield and modify its ifield list with the ifield's
|
303 |
|
|
; value. This would then let the decoder table builder handle it normally.
|
304 |
|
|
; I wouldn't create N insns, but would rather create an intermediary record
|
305 |
|
|
|
306 |
|
|
; ifield-assertions).
|
307 |
|
|
|