1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504
|
TODO list for libguestfs
======================================================================
This list contains random ideas and musings on features we could add
to libguestfs in future.
- RWMJ
FUSE API
--------
The API needs more test coverage, particularly lesser-used system
calls.
The big unresolved issue is UID/GID mapping between guest filesystem
IDs and the host. It's not easy to automate this because you need
extra details about the guest itself in order to get to its
UID->username map (eg. /etc/passwd from the guest).
Haskell bindings
----------------
Complete the Haskell bindings (see discussion on haskell-cafe).
PHP bindings
------------
Add bindtests to PHP bindings.
Complete bind tests
-------------------
Complete the bind tests - must test the return values and error cases.
virt-inspector - make libvirt XML
---------------------------------
It should be possible to generate libvirt XML from virt-inspector
data, at least partially. This would be just another output type so:
virt-inspector --libvirt guest.img
Note that recent versions of libvirt/virt-install allow guests to be
imported, so this is not so useful any more.
Ideas for extra commands
------------------------
General glibc / core programs:
chgrp
more mk*temp calls
ext2 properties:
badblocks
debugfs
dumpe2fs
e2image
e2undo
filefrag
mklost+found
SELinux:
chcat
restorecon
ch???
Oddball:
pivot_root
fts(3) / ftw(3)
Other initrd-* commands
-----------------------
Such as:
initrd-extract
initrd-replace
virt-rescue pty
---------------
See:
http://search.cpan.org/~rgiersig/IO-Tty-1.08/Pty.pm
http://www.perlmonks.org/index.pl?node_id=582185
Note that pty requires cooperation inside the C code too (there are
two sides to a pty, and one has to be handled after the fork).
[I tried to implement this in the new C virt-rescue, but it doesn't
work. qemu is implementing its own ptys, and they are broken. Need
to fix qemu.]
Windows-based daemon/appliance
------------------------------
See discussion on list:
https://www.redhat.com/archives/libguestfs/2009-November/msg00165.html
virt-disk-explore
-----------------
For multi-level disk images such as live CDs:
http://rwmj.wordpress.com/2009/07/15/unpack-the-russian-doll-of-a-f11-live-cd/
It's possible with libguestfs to recursively look for anything that
might be a filesystem, mount-{,loop} it and look in those, revealing
anything in a disk image.
However this won't work easily for VM disk images in the disk image.
One would have to download those to the host and launch another
libguestfs instance.
[Not sure this is such a good idea. See also live CD inspection idea below.]
Map filesystems to disk blocks
------------------------------
Map files/filesystems/(any other object) to the actual disk
blocks they occupy.
And vice versa.
Is it even possible?
See also contribs/visualize-alignment/
Integration with host intrusion systems
---------------------------------------
Perfect way to monitor VMs from outside the VM. Look for file
hashes, log events, login/logout etc.
http://www.ossec.net/
http://la-samhna.de/samhain/
http://sourceforge.net/projects/aide/
http://osiris.shmoo.com/
http://sourceforge.net/projects/tripwire/
Freeze/thaw filesystems
-----------------------
Access to these ioctls:
http://git.kernel.org/linus/fcccf502540e3d7
Tips for new users in guestfish
-------------------------------
$ guestfish
Tip: You need to 'add disk.img' or 'alloc disk.img nn' to make a new image.
Type 'notips' to disable tips permanently.
><fs> add mydisk
Tip: You need to type 'run' before you can see into the disk image.
><fs> run
Tip: Use 'list-filesystems' to see what filesystems are available.
><fs> list-filesystems
/dev/vda1
Tip: Use 'mount fs /' to mount a filesystem.
><fs> mount /dev/vda1 /
Tip: Use 'll /' to view the filesystem or ...
><fs> ll /
Could we make guestfish interactive if commands are used without params?
------------------------------------------------------------------------
><fs> sparse
[[Prints man page]]
Image name? disk.img
Size of image? 10M
Better support for encrypted devices
------------------------------------
Currently LUKS support only works if the device contains volume
groups. If it contains, eg., partitions, you cannot access them.
We would like to add:
- Direct access to the /dev/mapper device (eg. if it contains
anything apart from VGs).
Display image as PS
-------------------
Display the structure of an image file as a PS.
Greater use of blkid / libblkid
-------------------------------
There are various useful functions in libblkid for listing partitions,
devices etc which we are essentially duplicating in the daemon. It
would make more sense to just use libblkid for this.
There are some places where we call out to the 'blkid' program. This
might be replaced by direct use of the library (if this is easier).
But it is very hard to be compatible between RHEL6 and RHEL5 when
using direct library.
Visualization
-------------
Eric Sandeen pointed out the blktrace tool which is a better way of
capturing traces than using patched qemu (see
contrib/visualize-alignment). We would still use the same
visualization tools in conjunction with blktrace traces.
guestfish parsing
-----------------
At the moment guestfish uses an ad hoc parser which has many
shortcomings. We should change to using a lex/yacc-based scanner and
parser (there are better parsers out there, but yacc is sufficient and
very widely available).
The scanner must deal with the case of parsing a whole command string,
eg. for a command that the user types in:
><fs> add-drive-opts "/tmp/foo" readonly:true
and also with parsing single words from the command line:
guestfish add-drive-opts /tmp/foo readonly:true
Note the quotes are for scanning and don't indicate types.
We should also allow variables and expressions as part of this new
parsing code, eg:
set roots inspect-os
set product inspect-get-product-name %{roots[0]}
% is better than $ because of shell escaping and confusion with shell
variables.
Can we combine this with ability to set and read environment
variables? Currently guestfish uses many environment variables like
$EDITOR without any corresponding ability to set them.
set EDITOR /usr/bin/emacs
echo $EDITOR # or %{EDITOR}
edit /etc/resolv.conf
More ntfs tools
---------------
ntfsprogs actually has a lot more useful tools than we currently
use. Interesting ones are:
ntfscluster: display file(s) that occupy a cluster or sector
ntfsinfo: print various information about NTFS volume and files
ntfs streams: extract alternate streams from NTFS files
ntfsck: checker for NTFS filesystems
Undelete files
--------------
Two useful tools:
- ext2undelete
- ntfsundelete
More mkfs_opts options
----------------------
Useful options to offer:
- Set label.
- Set UUID.
Use /proc/self/mountinfo
------------------------
This file contains lots of interesting information about
what is mounted and where. eg:
16 21 0:3 / /proc rw,relatime - proc /proc rw
17 21 0:16 / /sys rw,relatime - sysfs /sys rw,seclabel
18 23 0:5 / /dev rw,relatime - devtmpfs udev rw,seclabel,size=1906740k,nr_inodes=476685,mode=755
26 21 253:3 / /home rw,relatime - ext4 /dev/mapper/vg-lv_home rw,seclabel,barrier=1,data=ordered
This could be used instead of current hairy code to parse the output
of the 'mount' command. We could add new APIs to return kernel mount
options, type of filesystem at a mountpoint etc.
guestfish drive letters
-----------------------
There should be an option to mount all Windows drives as separate
paths, like C: => /c/, D: => /d/ etc.
More inspection features
------------------------
- last shutdown time
- DHCP address
- last time the software was updated
- last user who logged in
- lastlog, last, who
Integrate virt-inspector with CMDBs
-----------------------------------
Either integrate virt-inspector with Configuration Management
Databases (CMDBs) or at least check that virt-inspector produces the
right range of data so that integration would be possible. The
standards for CMDBs come from the DMTF, see eg:
http://dmtf.org/news/pr/2009/7/dmtf-releases-cmdbf-standard-federating-configuration-management-data
Efficient way to visit all files
--------------------------------
https://rwmj.wordpress.com/2010/12/15/tip-audit-virtual-machine-for-setuid-files/#content
A naive method would look like:
g#visit ~return_stats:true "/" (
fun pathname stat ->
...
)
However this has two disadvantages:
- requires hand-written custom bindings in each language
- unclear about locking, thread-safety and re-entrancy of handle g
A better way would be to have some sort of explicit "download all
filenames and stat structures", which could then be iterated over:
let files = g#find_opts ~return_stats:true "/" in
List.iter (
fun pathname stat ->
...
)
The problem with this is that 'files' is going to be larger than a
protocol buffer.
This leads to thinking about changes to the protocol / generator to
make this simpler. The proposal would be to add RBigStringList,
RBigStructList [or RBig (Ranytype ...)]. These would work like
FileOut, in that they would use file streaming to stream XDR
structures (probably written to a file on the library side).
Generated code would hide most of the implementation.
We also need to think about security issues: is it possible for the
daemon to keep sending back data forever, and if so what happens on
the library side.
[Users can now use virt-ls to solve some of these problems, but it is
not a general solution at the API level]
Interactive disk creator
------------------------
An interactive disk creator program.
Attach method for disconnected operation
----------------------------------------
http://libguestfs.org/guestfs.3.html#guestfs_set_attach_method
"Librarian" has an idea that he should be able to attach to a regular
appliance, but disconnect from it and reconnect to it later. This
would be some sort of modified attach method (see link above).
The complexity here is that we would no longer have access to
stdin/stdout (or we'd have to direct that somewhere else).
libosinfo mappings for virt-inspector
-------------------------------------
Return libosinfo mappings from inspection API.
virt-sysprep ideas
------------------
- other Spacewalk / RHN IDs (?)
- Kerberos keys
- Puppet registration
- Windows sysprep
(see: https://github.com/clalancette/oz/blob/e74ce83283d468fd987583d6837b441608e5f8f0/oz/Windows.py )
- (librarian suggests ...)
. install a firstboot script virt-sysprep --script=/tmp/foo.sh
. run an external shell script
. run external guestfish script virt-sysprep --fish=/tmp/foo.fish
- /var/run/* and pam_faillock's data files
- if drives are encrypted, then dm-crypt key should be changed
and drives all re-encrypted
- /etc/pki
(Steve says ...)
Rpm uses nss. Nss sets up its crypto database in
/etc/pki. Depending on how long the machine ran before cloning, you
may have picked up some certificates or things. This is an area
that you would want to look into.
- secure erase of inodes etc using scrub (Steve Grubb)
- other directories that could require cleaning include:
/var/run/*
(thanks Marko Myllynen, James Antill)
- remove or modify UUIDs in /etc/fstab (eg. on Ubuntu)
(thanks Joshua Daniel Franklin)
As well as 'virt-sysprep' there is a case for a 'virt-customize' tool
which can customize templated guests. This would be useful within
companies/organizations that want to offer multiple guests, but all
customized with the organization logo etc. Some ideas:
- change the background image to some custom desktop
- change the sign-on messages (/etc/issue.net etc)
- firstboot script (as suggested by librarian above)
- Windows login script/service
Launch remote sessions over ssh
-------------------------------
We had an idea you could add a launch method that uses ssh, ie. all
febootstrap and qemu commands happen the same as now, but prefixed by
ssh so it happens on a remote machine.
Note that proper remote support and integration with libvirt is
different from this, and people are working on that. ssh would just
be "remote-lite".
virt-make-fs and virt-win-reg need to not be in Perl
----------------------------------------------------
Probably they should be in C or OCaml.
Integrate snap-type functionality in inspection tools
-----------------------------------------------------
Mo Morsi's "snap" program lets you describe a guest as the list of
packages (eg. RPMs) installed + changes made to those RPMs + files
added.
http://projects.morsi.org/wiki/Snap
This results in a compact description of the guest. He even managed
to do a kind of migration of guests by simply recreating the guest
from the description on the target machine.
It would be ideal to integrate this and/or use inspection to do this.
Ongoing code cleanups
---------------------
Examine every use of 'int' in C code for signed overflow problems.
All file descriptors in the library and daemon should normally be
opened with O_CLOEXEC. Therefore we need to examine every call to:
- open, openat
- creat
- pipe (see also: pipe2)
- dup, dup2 (see also: dup3)
- socket, socketpair
- accept (see also: accept4)
- signalfd, timerfd, epoll_create
virt-sparsify enhancements
--------------------------
TMPDIR should be checked to ensure that we won't run out of space
during the conversion, since current behaviour is very bad when this
happens (it usually causes virt-sparsify to hang). This requires
writing a small C binding to statvfs for OCaml.
'virt-sparsify --whitelist' option to generate skeletons (for
debugging, bug forensics, diagnosis). The whilelist option would
specify a list of files to be *preserved*. All other files in the
image would be replaced by equivalent files of zeroes, thus minimizing
the size of the debug image that needs to be shipped to us by the
customer.
Optimize the appliance
----------------------
Pass -cpu host. Anything else?
Sort out partitioning
---------------------
Ignoring some legacy APIs, we currently have a mixed selection of
'part-*' APIs, implemented using parted. We don't like parted or
libparted very much, and would love to replace it with something else.
The part-* APIs are quirky, but not too bad and we should maintain and
extend them instead of making another set of APIs.
One option is to write "libmbr" and "libgpt" libraries that would just
do MBR and GPT respectively, and do it directly and do it well. They
wouldn't try to abstract anything (so, unlike libparted). We could
then reimplement the part-* APIs on top of these hopefully sensible
libraries. This is a lot of work.
Another option is to look for tools or libraries to replace parted.
For GPT there is a fairly obvious candidate: Rod Smith's GPT fdisk
(http://www.rodsbooks.com/gdisk/). Rod has spent a lot of time
studying GPT, and seems to know more about it than any sane man
should. There is a command line tool designed for scripts called
'sgdisk'. The tools are packaged for many Linux distros. Even if
this approach works, it doesn't solve the MBR problem, so likely we'd
have to write a library for that (or perhaps go back to sfdisk but
using a very abstracted interface over sfdisk).
|