#LyX 1.3 created this file. For more info see http://www.lyx.org/
\lyxformat 221
\textclass article
\begin_preamble
\usepackage{hyperref}
\usepackage{color}
\definecolor{orange}{cmyk}{0,0.4,0.8,0.2}
\definecolor{brown}{cmyk}{0,0.75,0.75,0.35}
% Use and configure listings package for nicely formatted code
\usepackage{listings}
\lstset{
language=Python,
basicstyle=\small\ttfamily,
commentstyle=\ttfamily\color{blue},
stringstyle=\ttfamily\color{brown},
showstringspaces=false,
breaklines=true,
postbreak = \space\dots
}
\end_preamble
\language english
\inputencoding auto
\fontscheme palatino
\graphics default
\paperfontsize 11
\spacing single
\papersize Default
\paperpackage a4
\use_geometry 1
\use_amsmath 0
\use_natbib 0
\use_numerical_citations 0
\paperorientation portrait
\leftmargin 1in
\topmargin 0.9in
\rightmargin 1in
\bottommargin 0.9in
\secnumdepth 3
\tocdepth 3
\paragraph_separation skip
\defskip medskip
\quotes_language english
\quotes_times 2
\papercolumns 1
\papersides 1
\paperpagestyle default
\layout Title
Interactive Notebooks for Python
\newline
\size small
An IPython project for Google's Summer of Code 2005
\layout Author
Fernando P
\begin_inset ERT
status Collapsed
\layout Standard
\backslash
'{e}
\end_inset
rez
\begin_inset Foot
collapsed true
\layout Standard
\family typewriter
\size small
Fernando.Perez@colorado.edu
\end_inset
\layout Abstract
This project aims to develop a file format and interactive support for documents
which can combine Python code with rich text and embedded graphics.
The initial goal is limited to editing such documents with a normal programming
editor, with final rendering to PDF or HTML done by calling an external
program.
The editing component would have to be integrated with IPython.
\layout Abstract
This document was written by the IPython developer; it is made available
to students looking for projects of interest and for inclusion in their
application.
\layout Section
Project overview
\layout Standard
Python's interactive interpreter is one of the language's most appealing
features for certain types of usage, yet the basic shell which ships with
the language is very limited.
Over the last few years, IPython
\begin_inset Foot
collapsed true
\layout Standard
\begin_inset LatexCommand \htmlurl{http://ipython.scipy.org}
\end_inset
\end_inset
has become the de facto standard interactive shell in the scientific computing
community, and it enjoys wide popularity with general audiences.
All the major Linux distributions (Fedora Core via Extras, SUSE, Debian)
and OS X (via fink) carry IPython, and Windows users report using it as
a viable system shell.
\layout Standard
However, IPython is currently a command-line-only application based on the
readline library, and hence limited to single-line editing.
While this is sufficient in many contexts, there are use cases where integration
in a graphical user interface (GUI) is desirable.
\layout Standard
In particular, we wish to have an interface where users can execute Python
code, input regular text (neither code nor comments) and keep inline graphics,
which we will call
\emph on
Python notebooks
\emph default
.
This kind of system is very popular in scientific computing; well known
implementations can be found in Mathematica
\begin_inset ERT
status Collapsed
\layout Standard
\backslash
texttrademark
\end_inset
\SpecialChar ~
\begin_inset Foot
collapsed true
\layout Standard
\begin_inset LatexCommand \htmlurl{http://www.wolfram.com/products/mathematica}
\end_inset
\end_inset
and Maple
\begin_inset ERT
status Collapsed
\layout Standard
\backslash
texttrademark
\end_inset
\SpecialChar ~
\begin_inset Foot
collapsed true
\layout Standard
\begin_inset LatexCommand \htmlurl{http://www.maplesoft.com}
\end_inset
\end_inset
, among others.
However, these are proprietary (and quite expensive) systems aimed at an
audience of mathematicians, scientists and engineers.
\layout Standard
The full-blown implementation of a graphical shell supporting this kind
of work model is probably too ambitious for a summer project.
Simultaneous support for rich-text editing, embedded graphics and syntax-highlighted
code is extremely complex, and likely to require far more effort than
can be mustered by an individual developer for a short-term project.
\layout Standard
This project will thus aim to build the necessary base infrastructure to
be able to edit such documents from a plain text editor, and to render
them to suitable formats for printing or online distribution, such as HTML,
PDF or PostScript.
This model mirrors the LaTeX workflow, in which documents can be edited
with any text editor and are rendered by a separate program.
\layout Standard
Such documents would be extremely useful for many audiences beyond scientists:
one can use them to produce richly documented code, to explore a problem
with Python while keeping all relevant information in the same place, to
distribute enhanced Python-based educational materials, etc.
\layout Standard
Demand for such a system exists, as evidenced by repeated requests made
to me by IPython users over the last few years.
Unfortunately IPython is only a spare-time project for me, and I have not
had the time to devote to this, despite being convinced of its long term
value and wide appeal.
\layout Standard
If this project is successful, the infrastructure laid out by it will be
immediately useful for Python users wishing to maintain `literate' programs
which include rich formatting.
In addition, this will open the door for the future development of graphical
shells which can render such documents in real time: this is exactly the
development model successfully followed by the LyX
\begin_inset Foot
collapsed true
\layout Standard
\begin_inset LatexCommand \htmlurl{http://www.lyx.org}
\end_inset
\end_inset
document processing system.
\layout Section
Implementation effort
\layout Subsection
Specific goals
\layout Standard
This is a brief outline of the main points of this project.
The next section provides details on all of them.
The student(s) working on the project would need to:
\layout Enumerate
Design the internal file structure so that notebooks remain valid Python
source files.
\layout Enumerate
Implement the rendering library, capable of processing an input notebook
through reST or LaTeX and producing HTML or PDF output, as well as exporting
a `pure-code' Python file stripped of all markup calls.
\layout Enumerate
Study existing programming editor widgets to find the most suitable one
for extending with an IPython connector for interactive execution of the
notebooks.
\layout Subsection
Complexity level
\layout Standard
This project is relatively complicated.
While I will gladly assist the student with design and implementation issues,
it will require a fair amount of thinking in terms of overall library architecture.
The actual implementation does not require any sophisticated concepts,
but rather a reasonably broad knowledge of a wide set of topics (markup,
interaction with external programs and libraries, namespace tricks to provide
runtime changes in the effect of the markup calls, etc.).
\layout Standard
While raw novices are welcome to try, I suspect that it may be a bit too
much for them.
Students wanting to apply should keep in mind, if the money is an important
consideration, that Google only gives the $4500 reward upon
\emph on
successful completion
\emph default
of the project.
So don't bite off more than you can chew.
Obviously if this doesn't matter, anyone is welcome to participate, since
the project can be a very interesting learning experience, and it will
provide a genuinely useful tool for many.
\layout Section
Technical details
\layout Subsection
The files
\layout Standard
A basic requirement of this project will be that the Python notebooks shall
be valid Python source files, typically with a
\family typewriter
.py
\family default
extension.
A renderer program can be used to process the markup calls in them and
generate output.
If run at a regular command line, these files should execute like normal
Python files.
But when run via a special rendering script, the result should be a properly
formatted file.
Output formats could be PDF or HTML depending on user-supplied options.
\layout Standard
A reST markup mode should be implemented, as reST is already widely used
in the Python community and is a very simple format to write.
The following is a sketch of what such files could look like using reST
markup:
\layout Standard
\begin_inset ERT
status Open
\layout Standard
\backslash
lstinputlisting{nbexample.py}
\end_inset
\layout Standard
A LaTeX markup mode should also be implemented.
Here is a mockup of what code using the LaTeX mode could look like:
\layout Standard
\begin_inset ERT
status Open
\layout Standard
\backslash
lstinputlisting{nbexample_latex.py}
\end_inset
\layout Standard
At this point, it must be noted that the code above is simply a sketch of
these ideas, not a finalized design.
An important part of this project will be to think about what the best
API and structure for this problem should be.
\layout Subsection
From notebooks to PDF, HTML or Python
\layout Standard
Once a clean API for markup has been specified, converters will be written
to take a Python source file which uses notebook constructs, and generate
final output in printable formats, such as HTML or PDF.
For example, if
\family typewriter
nbfile.py
\family default
is a Python notebook, then
\layout LyX-Code
$ pynb --export=pdf nbfile.py
\layout Standard
should produce
\family typewriter
nbfile.pdf
\family default
, while
\layout LyX-Code
$ pynb --export=html nbfile.py
\layout Standard
would produce an HTML version.
The actual rendering will be done by calling appropriate utilities, such
as the reST toolchain or LaTeX, depending on the markup used by the file.
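\layout Standard
The command lines above can be sketched as a small argument parser.
This is only an illustration under stated assumptions: the function names
build_parser and output_path, the EXTENSIONS table, and the rule that the
output name defaults to the notebook name with a new extension are all
hypothetical, not a finalized design.
The sketch only works out the output file name; it does not call the reST
or LaTeX toolchains.

```python
import argparse
from pathlib import Path

# Hypothetical mapping from the --export value to the output file extension.
EXTENSIONS = {"pdf": ".pdf", "html": ".html", "python": ".py"}


def build_parser():
    """Build a parser for: pynb --export=FORMAT notebook [output]."""
    parser = argparse.ArgumentParser(prog="pynb")
    parser.add_argument("--export", choices=sorted(EXTENSIONS), required=True)
    parser.add_argument("notebook", type=Path)
    parser.add_argument("output", type=Path, nargs="?", default=None)
    return parser


def output_path(args):
    """Return the file the export should be written to."""
    if args.output is not None:
        return args.output
    target = args.notebook.with_suffix(EXTENSIONS[args.export])
    if target == args.notebook:
        # Assumed safety rule: never overwrite the notebook itself.
        raise ValueError("an explicit output file is required for this export")
    return target


# Example: the PDF export of nbfile.py is written to nbfile.pdf.
pdf_args = build_parser().parse_args(["--export=pdf", "nbfile.py"])
pdf_target = output_path(pdf_args)
```

\layout Standard
Passing an explicit list to parse_args keeps the example independent of
the real command line; an actual script would call parse_args() without
arguments.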
\layout Standard
Although the notebooks will be valid Python files, executing them directly
still evaluates every markup call and returns its result, which is not
needed when the file is being treated as pure code.
For this reason, a module to execute these files turning the markup calls
into no-ops should be written.
Using Python 2.4's -m switch, one can then use something like
\layout LyX-Code
$ python -m notebook nbfile.py
\layout Standard
and the notebook file
\family typewriter
nbfile.py
\family default
will be executed without any overhead introduced by the markup (other than
making calls to functions which return immediately).
Finally, an exporter to clean code can be trivially implemented, so that:
\layout LyX-Code
$ pynb --export=python nbfile.py nbcode.py
\layout Standard
would export only the code in
\family typewriter
nbfile.py
\family default
to
\family typewriter
nbcode.py
\family default
, removing the markup completely.
This can be used to generate final production versions of large modules
implemented as notebooks, if one wants to eliminate the markup overhead.
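\layout Standard
One possible way to turn the markup calls into no-ops is to choose, at run
time, which object the notebook sees under the markup name.
The following is a minimal sketch under stated assumptions: the name nb,
the text call, and the classes RenderingMarkup and NoOpMarkup are hypothetical,
and the markup object is injected into the namespace here only to keep the
example short.
The actual API remains to be designed.

```python
class RenderingMarkup:
    """Hypothetical markup object that records text for a later renderer."""

    def __init__(self):
        self.chunks = []

    def text(self, source):
        self.chunks.append(source)


class NoOpMarkup:
    """Hypothetical stand-in whose every markup call returns immediately."""

    def __getattr__(self, name):
        # Any attribute looks like a function that ignores its arguments.
        return lambda *args, **kwargs: None


def run_notebook(source, render):
    """Execute notebook source with either the real or the no-op markup."""
    markup = RenderingMarkup() if render else NoOpMarkup()
    namespace = {"nb": markup}
    exec(compile(source, "<notebook>", "exec"), namespace)
    return markup, namespace


# A tiny stand-in for a notebook file; its contents are illustrative only.
EXAMPLE = """
nb.text("Adding two numbers")

def add(a, b):
    return a + b

result = add(2, 3)
"""

rendered, rendered_ns = run_notebook(EXAMPLE, render=True)
silent, silent_ns = run_notebook(EXAMPLE, render=False)
```

\layout Standard
In both runs the code itself behaves identically; only the rendering run
collects the text passed to the markup call.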
\layout Subsection
The editing environment
\layout Standard
The first and most important part of the project should be the definition
of a clean API and the implementation of the exporter modules as indicated
above.
Ultimately, such files can be developed using any text editor, since they
are nothing more than regular Python code.
\layout Standard
But once these goals are reached, further integration with an editor will
be done, without the need for a full-blown GUI shell.
In fact, already today the (X)Emacs editors can provide for interactive
usage of such files.
Using python-mode in (X)Emacs, one can pass highlighted regions of a file
for execution to an underlying Python process, and the results are printed
in the Python window.
With recent versions of python-mode, IPython can be used instead of the
plain Python interpreter, so that IPython's extended interactive capabilities
become available within (X)Emacs (improved tracebacks, automatic debugger
integration, variable information, easy filesystem access to Python, etc.).
\layout Standard
But even with IPython integration, the usage within (X)Emacs is not ideal
for a notebook environment, since the Python process buffer is separate
from the Python file.
Therefore, the next stage of the project will be to enable tighter integration
between the editing and execution environments.
The basic idea is to provide an easy way to mark regions of the file to
be executed interactively, and to have the output inserted automatically
into the file.
The following listing is a mockup of what the resulting file could look
like:
\layout Standard
\begin_inset ERT
status Open
\layout Standard
\backslash
lstinputlisting{nbexample_output.py}
\end_inset
\layout Standard
Basically, the editor will execute
\family typewriter
add(2,3)
\family default
and insert the string representation of the output into the file, so it
can be used for rendering later.
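\layout Standard
The insertion step can be sketched in a few lines of Python.
This is an illustration only: the function name run_and_insert and the
choice of storing the output as a comment line starting with "# Out:" are
assumptions made for this example, not the format the project will settle
on.

```python
def run_and_insert(lines, index, namespace):
    """Evaluate the expression on lines[index] and return a new list of
    lines with the repr of its value inserted right after it as comments."""
    value = eval(lines[index], namespace)
    output = ["# Out: " + part for part in repr(value).splitlines()]
    return lines[:index + 1] + output + lines[index + 1:]


# A tiny stand-in for a notebook file; its contents are illustrative only.
definitions = """
def add(a, b):
    return a + b
"""

namespace = {}
exec(definitions, namespace)  # make add() available to the expression

lines = definitions.splitlines() + ["add(2,3)"]
updated = run_and_insert(lines, len(lines) - 1, namespace)
```

\layout Standard
Because the output is stored as comments in this sketch, the updated file
is still a valid Python source file.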
\layout Section
Available resources
\layout Standard
IPython currently has all the necessary infrastructure for code execution,
albeit in a rather messy code base.
Most I/O is already abstracted out, a necessary condition for embedding
in a GUI (since you are not writing to stdout/err but to the GUI's text
area).
\layout Standard
For interaction with an editor, it will be necessary to identify a good
programming editor with a Python-compatible license, which can be extended
to communicate with the underlying IPython engine.
IDLE, the Tk-based IDE which ships with Python, should obviously be considered.
The Scintilla editing component
\begin_inset Foot
collapsed true
\layout Standard
\begin_inset LatexCommand \htmlurl{http://www.scintilla.org}
\end_inset
\end_inset
may also be a viable candidate.
\layout Standard
It will also be interesting to look at the LyX editor
\begin_inset Foot
collapsed true
\layout Standard
\begin_inset LatexCommand \htmlurl{http://www.lyx.org}
\end_inset
\end_inset
, which already offers a Python client
\begin_inset Foot
collapsed true
\layout Standard
\begin_inset LatexCommand \htmlurl{http://wiki.lyx.org/Tools/PyClient}
\end_inset
\end_inset
.
Since LyX has very sophisticated LaTeX support, this is a very interesting
direction to consider for the future (though LyX makes a poor programming
editor).
\layout Section
Support offered to the students
\layout Standard
The IPython project already has an established Open Source infrastructure,
including CVS repositories, a bug tracker and mailing lists.
As the main author and sole maintainer of IPython, I will personally assist
the student(s) funded with architectural and design guidance, preferably
on the public development mailing list.
I expect them to start working by submitting patches until they show, by
the quality of their work, that they can be granted CVS write access.
I expect most actual implementation work to be done by the students, though
I will provide assistance if they need it with a specific technical issue.
\layout Standard
If more than one applicant is accepted to work on this project, there is
more than enough work to be done which can be coordinated between them.
\layout Section
Licensing and copyright
\layout Standard
IPython is licensed under BSD terms, and copyright of all sources rests
with the original authors of the core modules.
Over the years, all external contributions have been small enough patches
that they have been simply folded into the main source tree without additional
copyright attributions, though explicit credit has always been given to
all contributors.
\layout Standard
I expect the students participating in this project to contribute enough
standalone code that they can retain the copyright to it if they so desire,
as long as they agree to license all their work under BSD terms.
\layout Section
Acknowledgements
\layout Standard
I'd like to thank John D.
Hunter, the author of matplotlib
\begin_inset Foot
collapsed true
\layout Standard
\begin_inset LatexCommand \htmlurl{http://matplotlib.sf.net}
\end_inset
\end_inset
, for lengthy discussions which helped clarify much of this project.
In particular, I thank him for pushing hard enough to convince me of one
important decision: embedding the notebook markup calls in true Python
functions instead of specially-tagged strings or comments.
\layout Standard
My conversations with Brian Granger, the author of PyXG
\begin_inset Foot
collapsed true
\layout Standard
\begin_inset LatexCommand \htmlurl{http://hammonds.scu.edu/~classes/pyxg.html}
\end_inset
\end_inset
and braid
\begin_inset Foot
collapsed true
\layout Standard
\begin_inset LatexCommand \htmlurl{http://hammonds.scu.edu/~classes/braid.html}
\end_inset
\end_inset
, have also been very useful in clarifying details of the necessary underlying
infrastructure and future evolution of IPython for this kind of system.
\layout Standard
Thank you also to the IPython users who have, in the past, discussed this
topic with me either in private or on the IPython or SciPy lists.
\the_end