1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452
|
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<head>
<title>IDE Communications Protocol</title>
<link rel="stylesheet" type="text/css" href="main.css">
</head>
<body bgcolor="#ffffff">
<h1>IDE Communications Protocol</h1>
<p>Poly/ML release 5.3 introduces support for an Integrated Development Environment
to extract extra information about a program. This documents the protocol used
to exchange information with the front-end. It is written on top of functions
that extract the information from the compiler's parse tree. Some applications
may find it more convenient to interact directly with these functions and implement
their own protocol. This document is primarily aimed at writers of IDEs or plug-ins
who are interacting with the normal ML top-level.</p>
<h2>Lexical</h2>
<p>The basic format uses a binary XML-like representation in which the escape
character (0x1b) is used as a special marker. It may be followed by other characters
that determine how the remainder of the input is to be treated. Strings are
sent as a sequence of bytes terminated by the escape character. If the escape
character itself appears in the string it is sent as two escape characters,
except within compilation input (see below). Where a value represents a number
it is sent as base ten, possibly preceded by ~ or -.</p>
<h2>Packets and Mark-up</h2>
<p>There are two different ways in which escape combinations may occur. Within
the communications protocol data is exchanged between the IDE and the Poly/ML
front-end using packets of data. These begin and end with an escape sequence
and use escape sequences, usually escape followed by comma, to separate the
elements. The opening escape sequence is always escape followed by an upper
case character and the closing sequence is always escape followed by the lower
case version of the opening sequence. For many cases, the format of the packet
is fixed but there is an exception in the case of marked-up text. Marked-up
text can arise in the case of error messages or some other output from the compiler
where extra information can be inserted at arbitrary point within the text of
the message. Such mark-up uses the same format as the protocol packets but the
opening section is delimited by escape followed by semicolon. Having a standard
format provides for upwards compatibility since an IDE can easily skip mark-up
that it does not recognise.</p>
<h2>Output mark-up</h2>
<p>Poly/ML can be run in a mode where it produces enhanced output but otherwise
runs a normal top-level. This can be used by the IDE to give the user access
to a normal interactive ML session. The <code>--with-markup</code> option to
Poly/ML runs the normal Poly/ML top-level loop but causes it to add mark-up
to some of its messages. Currently it is used in two cases; in error messages
and in messages showing where an identifier was declared.<br>
The format of the information showing a location is:
</p>
<div class="packet">ESC 'D' filename ESC ',' startline ESC ',' startlocation ESC
',' endlocation ESC ';'</div>
This is followed by the identifier itself and then the closing packet:
<div class="packet">ESC 'd'</div>
<p>An error message packet consists of </p>
<div class="packet">ESC 'E' kind ESC ',' filename ESC ',' startline ESC ',' startlocation
ESC ',' endlocation ESC ';'</div>
<p>"kind" is either 'E' indicating a hard error or 'W' indicating a
non-fatal warning. This is followed by the text of the message and then the
closing packet:</p>
<div class="packet">ESC 'e'</div>
<p>Mark up in the future will follow the same pattern allowing the IDE to skip
unrecognised mark-up. This mark-up is also used in some of the packets within
the full IDE protocol.</p>
<h2>Full IDE protocol</h2>
<p> When run with the <code>--ideprotocol</code>
option the top-level loop runs the full IDE communication protocol. This can also be started by <code>PolyML.IDEInterface.runIDEProtocol()</code> from within PolyML.
</p>
<p>This is
intended primarily for compiling files while they are being edited, either as
the result of an explicit request from the user or automatically. When this
option is given the front-end retains the parse tree and requests can be made
to extract information from the parse tree.
</p>
<p>When the IDE mode is started, PolyML sends the following message to std-out:
<div class="packet">ESC 'H' protocol-version-number ESC 'h'</div>
where <code>protocol-version-number</code> is the version number identifying the particular version of the PolyML protocol. Requests to PolyML should wait until this message has been sent by PolyML. The current version of the protocol is <code>1.0.0</code>.
</p>
<p>Requests to PolyML are in terms of byte offsets
within the last source text. If the text has been edited since it was last sent
to ML the IDE must convert positions within the current source text into positions
within the original before sending requests to ML and do the reverse conversion
before displaying the results.</p>
<h3>Requests from IDE to Poly/ML.</h3>
<p>Simple requests about the current parse tree all have the same format. They
contain a request code that describes the kind of information to return and
a pair of positions. Frequently the start and end positions will be the same.
PolyML searches for the smallest node in the parse tree that spans the
positions and returns information about that node. It always retains the actual
span for the node in the result so that the IDE can highlight the actual text
in the display.</p>
<p>Every request contains a request identifier which is returned in the result.
This allows the IDE to run asynchronously. A request identifier is an arbitrary
string generated by the IDE. The request identifier used in a compilation request
has a special status. This identifier is used to mark the version of the parse
tree that results from the compilation and must be included in commands that
query the parse tree. In that way Poly/ML is able to tell whether a request
refers to the current tree or to an older or newer version.</p>
<p>The format of a request packet is:
</p>
<div class="packet">ESC CODE request-id ESC ',' parse-tree-id ESC ',' start-offset ESC ',' end-offset ESC code</div>
The start character is an upper case character, the end character the corresponding lower case character.<br>
The CODE is, currently, one of the following
<table border="0" width="43%">
<tbody>
<tr>
<td width="11%">O</td>
<td width="89%">Return a list of properties for the node</td>
</tr>
<tr>
<td>T</td>
<td>Return type information</td>
</tr>
<tr>
<td>I</td>
<td>Return declaration location</td>
</tr>
<tr>
<td>M</td>
<td>Move relative to a given position</td>
</tr>
<tr>
<td>V</td>
<td>Return a list of references to an identifier</td>
</tr>
</tbody>
</table>
<p>Responses follow a similar structure to the request. The start and end code
for a response is the same as the start and end for the request. All responses
contain the actual start and end points of the current tree. If there is no
parse tree the start and end offsets will be zero. An unrecognised command will
return an empty response for forwards compatibility. Where a command is invalid
or unrecognised the response will be
</p>
<div class="packet">ESC CODE request-id ESC ',' parse-tree-id ESC ',' startoffset ESC ',' endoffset ESC code</div>
<p>In particular, because the IDE may issue requests while a compilation is running
the parse tree id in the request packet may not match the current parse tree
within Poly/ML. In that case the parse tree id in the result packet will contain
the <em>current</em> parse tree and not the parse tree id in the request. The
IDE must keep a list of requests it has sent along with the parse tree id it
used and if it receives a response with a different parse tree id it should
reissue the request adjusting the offsets to account for any changes.</p>
<p><span class="packet-name">O Request:</span>
</p>
<div class="packet">ESC 'O' request-id ESC ',' parse-tree-id ESC ',' start-offset ESC ',' end-offset ESC 'o'</div>
<span class="packet-name">O Response:</span>
<div class="packet">ESC 'O' request-id ESC ',' parse-tree-id ESC ',' start-offset ESC ',' end-offset
commands ESC 'o'</div>
The commands are a set of strings separated by ESC ','. At the moment the strings
are all single characters and identify the set of valid commands and sub-commands:
i.e. T,I,J,S,V,U,C,N and P. For forward compatibility the IDE should accept but
ignore other strings.
<p> <span class="packet-name">T Request:</span>
</p>
<div class="packet">ESC 'T' request-id ESC ',' parse-tree-id ESC ',' start-offset ESC ',' end-offset ESC 't'</div>
<span class="packet-name">T Response:</span>
<div class="packet">ESC 'T' request-id ESC ',' parse-tree-id ESC ',' start-offset ESC ',' end-offset
ESC ',' type-string ESC 't'</div>
Returns the type of the selected node as a string. There is no currently mark
up in the type. If this is not an expression this returns
<div class="packet">ESC 'T' request-id ESC ',' parse-tree-id ESC ',' start-offset ESC ',' end-offset ESC 't'</div>
<p><span class="packet-name">I Request: </span>
</p>
<div class="packet">ESC CHAR 'I' request-id ESC ',' ESC ',' parse-tree-id ESC ',' startoffset ESC ',' endoffset
ESC ',' request-type ESC 'i'</div>
<span class="packet-name">I Response:</span>
<div class="packet">ESC 'I' request-id ESC ',' parse-tree-id ESC ',' start-offset ESC ',' end-offset
ESC ',' filename ESC ',' startline ESC ',' startlocation ESC ',' endlocation ESC 'i'</div>
Thse requests return information about an identifier. Finds an identifier at
a given location and returns the location where that identifier was defined,
if the request-type is 'I'; where the identifier was added to the current name
space using an "open" declaration, if it is J and the location where
the structure that was opened was declared, if it is S. The first two locations
as usual indicate the position of the use of the identifier. The remaining information
relates to the declaration. Returns
<div class="packet">ESC 'I' request-id ESC ',' parse-tree-id ESC ',' startoffset ESC ',' endoffset ESC 'i'</div>
if this is not an identifier or did not come from opening a structure,
where appropriate. The information is similar to a D block in mark-up.
<table border="0" width="43%">
<tbody>
<tr>
<td>I</td>
<td>Return declaration location</td>
</tr>
<tr>
<td>J</td>
<td>Return the location where an identifier was opened</td>
</tr>
<tr>
<td>S</td>
<td>Return the location of an identifier's parent structure</td>
</tr>
</tbody>
</table>
<p><span class="packet-name">M request:</span>
</p>
<div class="packet">ESC CHAR 'M' ESC ',' parse-tree-id ESC ',' startoffset ESC ',' endoffset ESC ',' action ESC 'm'</div>
The action
string defines the direction of the move. Currently all moves are single character
strings.
<table border="0" width="43%">
<tbody>
<tr>
<td>"U"</td>
<td>Move to the parent node</td>
</tr>
<tr>
<td>"C"</td>
<td>Move to the first child node</td>
</tr>
<tr>
<td>"N"</td>
<td>Move to the next sibling</td>
</tr>
<tr>
<td>"P"</td>
<td>Move to the previous sibling</td>
</tr>
</tbody>
</table>
<span class="packet-name">M Response:</span>
<div class="packet">ESC CHAR 'M' parse-tree-id ESC ',' startoffset ESC ',' endoffset ESC 'm'</div>
PolyML selects the parse tree node at the given location and moves relative
to it. If the move is not possible the location remains unchanged. If the move is successful then the new move location is returned in the packet.
<p> <span class="packet-name">V Request:</span>
</p>
<div class="packet">ESC 'V' request-id ESC ',' parse-tree-id ESC ',' start-offset ESC ',' end-offset ESC 'v'</div>
<span class="packet-name">V Response:</span>
<div class="packet">ESC 'V' request-id ESC ',' parse-tree-id ESC ',' start-offset
ESC ',' end-offset ESC ',' start-ref1 ESC ',' end-ref1 ESC ',' .... start-refn
ESC ',' end-refn ESC 'v'</div>
Returns the list of local references to an identifier if the parse tree location
refers to a local value identifier. The results are a sequence of start and end
offsets. If the specified location is not a local value identifier the result
is
<div class="packet">ESC 'V' request-id ESC ',' parse-tree-id ESC ',' start-offset
ESC ',' end-offset ESC 'v'</div>
<h3>Compilation.</h3>
<p>In order to compile a piece of text the IDE sends it to ML through the protocol.
Because any previous compilation may have executed code and affected the global
state it is assumed that the IDE will set up some form of context for the file
by previously saving some state. Typically, this would require it to have compiled
all the files that this particular piece of source text will depend on and to
have saved it in a saved state. A compilation request therefore has the following
structure:
</p>
<div class="packet">ESC 'R' request-id ESC ',' sourcename ESC ',' startposition ESC ',' prelude-length
ESC ',' source-length ESC ',' prelude ESC ',' source-text ESC 'r'</div>
'prelude' is a piece of ML code to compile and execute before the user-supplied
code. Typically this will include a call to PolyML.SaveState.loadState to load
the initial state for the compilation.<br>
'source-text' is the ML code that is being compiled and executed.<br>
'sourcename' is the name to be used as the file name when reporting locations.<br>
'startposition' is the value to be used as the initial offset in the file. The
usual case is that this is zero or a null string which is taken as zero.<br>
'<code>prelude-length</code>' and
'<code>source-length</code>' are
the number of bytes in the prelude and source text respectively. Since the prelude
and source text may be large it is much more efficient to use these lengths
to read the input. Because these lengths are provided it is not necessary to
search for the escape code within the text and so if the escape character appears
within the text it is not itself escaped.
<p>Poly/ML responds with a result block. The format of the result block depends
on the result of the compilation and possible execution of the code. The result block has the form:
</p>
<div class="packet">ESC 'R' request-id ESC ',' parse-tree-id
ESC ',' result ESC ',' finaloffset ESC ';' errors_and_messages ESC 'r'</div>
"<code>result</code>" is a single character indicating success or failure.<br>
"<code>finaloffset</code>" is the byte position that indicates the extent of
the valid parse tree. If there was an error this may be less than the end of
the input. It may be the start of the input if there was a syntax error and
no parse tree could be created. As usual "<code>request-id</code>" is the ID of
the request. The "<code>parse-tree-id</code>" will normally be the same as the
"<code>request-id</code>" indicating that the compilation has updated the parse
tree, even if type checking failed. However, if there was a failure, such as
during parsing, that meant that no new parse tree could be produced the ID returned
will be the original parse tree ID, or the empty string if there was none. <br>
Error packets within the <code>errors_and_messages</code> have the same format as described above for mark-up.
</p>
<p>The result codes are<br>
S - Success. The file compiled successfully and ran without an exception.<br>
X - Exception. The file compiled successfully but raised an exception when it
ran.<br>
L - The prelude code failed to compile or raised an exception.<br>
F - Parse or type checking failure.<br>
C - Cancelled during compilation.<br>
The parse tree will be updated to reflect the result of the compilation and
the current parse tree identifier used by Poly/ML will be set to the identifier
supplied in the request. </p>
<p>For a result code of S (compiled successfully) there may be warnings.</p>
<p> For a result code of L (prelude code failed) the result packet contains the exception packet that
was returned and has the form:<br>
</p>
<div class="packet">ESC 'R' request-id ESC ',' parse-tree-id ESC ',' 'L' ESC ',' startOffset
ESC ';' error_message ESC 'r'</div>
<p>For a result code of F (parsing or type checking failed) the result packet contains a list of one or more error
packets. The format of the result packet is:
</p>
<div class="packet">ESC 'R' request-id ESC ',' parse-tree-id ESC ',' 'F' ESC ',' endOffset
ESC ';' error_packets ESC 'r'</div>
<p>Where the error packets have the same format as described above for mark-up.</p>
<p>For a result code of C (cancel compiled) the result may or may not contain error packets depending
on whether the compilation had produced error messages before the compilation
was cancelled.</p>
<p>For a result code of X the <code>errors_and_messages</code> result packet contains the exception message first within a <code>X</code> tag, as
a string. The string may also containing output mark-up such as the D-style mark-up showing the location
where the exception was raised. Thus the format of the packet with exception data
is: <br>
</p>
<div class="packet">ESC 'R' request-id ESC ',' parse-tree-id ESC ',' 'X' ESC ',' startOffset
ESC ';' ESC 'X' exception_message ESC 'x' error_packets ESC 'r'</div>
<h3>Cancellation.</h3>
<p>Compilation is run as a separate thread and may be cancelled using the K-request.<br>
</p>
<div class="packet">ESC 'K' request-id ESC 'k'</div>
<br>
The request identifier is the identifier used in the R-request that should be
cancelled. Unlike other requests this packet does not have its own identifier
nor does it have a direct response.
<p>The action on receiving a cancel request depends on the current state of the
compilation. If the compilation has already finished no action is taken. Poly/ML
will have already sent a result packet for the compilation. If the compilation
is in progress Poly/ML will attempt to cancel it by sending an interrupt to
the compilation thread to ask it to terminate. If the thread is actually in
the compiler at the time the interrupt is received the result will be a C result
code but if it is actually executing the result of compilation this code will
receive the Interrupt exception. Assuming it does not trap it the result will
be an X result code. The thread may actually have completed before the interrupt
is processed so any other result is also possible.</p>
<p> </p>
</body>
</html>
|