File: skins_windows.txt

                        Windows Implementation
                     Sleuth Kit Implementation Notes
                        http://www.sleuthkit.org

                            Brian Carrier
                       Last Updated: August 2006


INTRODUCTION
=======================================================================
Version 2.06 of The Sleuth Kit (TSK) added support for Microsoft Windows.
Several design changes were needed so that TSK could run on both Windows
and Unix systems.  The biggest change, and the focus of this document,
was how Unicode and non-English characters are handled.


PROBLEM 
=======================================================================
Unicode characters can be stored in multiple encodings.  Unix systems
use UTF-8, which stores each character in 1, 2, 3, or 4 bytes.  Windows
uses UTF-16, which stores each character in 2 or 4 bytes.  Because of
this difference, the input to and output of TSK are encoded differently
on Windows than on Unix.
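
As a worked illustration of the size difference, the following C sketch
computes how many bytes a single Unicode code point occupies in each
encoding.  The function names are hypothetical and are not taken from
the TSK source.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes needed to store one code point in UTF-8 (0 if not encodable). */
size_t ex_utf8_len(uint32_t cp)
{
    if (cp < 0x80) return 1;
    if (cp < 0x800) return 2;
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0;  /* surrogate range */
    if (cp < 0x10000) return 3;
    if (cp <= 0x10FFFF) return 4;
    return 0;                                    /* beyond Unicode */
}

/* Bytes needed to store one code point in UTF-16 (0 if not encodable). */
size_t ex_utf16_len(uint32_t cp)
{
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0;  /* surrogate range */
    if (cp < 0x10000) return 2;                  /* one 16-bit unit */
    if (cp <= 0x10FFFF) return 4;                /* surrogate pair */
    return 0;
}
```

For example, the letter A (U+0041) takes 1 byte in UTF-8 and 2 bytes in
UTF-16, while the Euro sign (U+20AC) takes 3 bytes in UTF-8 and 2 bytes
in UTF-16.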


SOLUTION
=======================================================================
The solution to this problem was to create many C #defines that map
a general name to the specific function or type that is used on each
platform.  Internally, all code uses the UTF-8 encoding.  This means
that the input and output may need to be converted on Windows.
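
The #define approach described above can be sketched roughly as
follows.  The EX_ names are hypothetical stand-ins chosen for this
example; the actual names used in the TSK source may differ.

```c
#include <stddef.h>

#ifdef _WIN32
#include <wchar.h>
/* Windows build: command line text arrives as UTF-16 wide characters. */
typedef wchar_t EX_TCHAR;
#define EX_T(s)     L##s
#define EX_TSTRLEN  wcslen
#define EX_TSTRCMP  wcscmp
#else
#include <string.h>
/* Unix build: command line text arrives as UTF-8 bytes. */
typedef char EX_TCHAR;
#define EX_T(s)     s
#define EX_TSTRLEN  strlen
#define EX_TSTRCMP  strcmp
#endif

/*
 * Platform-neutral code is written once against the general names.
 * Returns nonzero if the argument is the (hypothetical) "-h" flag.
 */
int ex_is_help_flag(const EX_TCHAR *arg)
{
    return EX_TSTRCMP(arg, EX_T("-h")) == 0;
}
```

The same source compiles on both platforms, and only the block of
#defines needs to know which character type is in use.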

The input data consists of image file names; image, volume, and file
system types; and addresses.  There is no need to convert the file names
because the native system calls expect the same encoding as the input.
For the image, volume, and file system types, I assume that they will
always be in English, so they are easily converted to ASCII on Windows.
Lastly, addresses in string form are easy to convert to an integer,
and this is done using either UTF-8 or UTF-16 atoi-type functions.
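
A minimal sketch of these two input conversions, using only standard C
library calls, might look like the following.  The function names are
hypothetical, and strtoul and wcstoul are used here as examples of
"atoi-type" functions; the functions TSK actually calls may differ.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <wchar.h>

/*
 * Copy a wide-character type name (for example L"ntfs") into a char
 * buffer.  Only 7-bit ASCII is accepted.  Returns 0 on success and -1
 * if a character is not ASCII or the buffer is too small.
 */
int ex_wide_to_ascii(const wchar_t *src, char *dst, size_t dst_len)
{
    size_t i;

    if (dst_len == 0) return -1;
    for (i = 0; src[i] != L'\0'; i++) {
        if (i + 1 >= dst_len) return -1;           /* no room for NUL */
        if ((unsigned long) src[i] > 0x7F) return -1;
        dst[i] = (char) src[i];
    }
    dst[i] = '\0';
    return 0;
}

/* Parse a decimal address from a char (UTF-8 or ASCII) string. */
int ex_parse_addr(const char *s, unsigned long *out)
{
    char *end;

    if (s[0] < '0' || s[0] > '9') return -1;   /* rejects "" and "-1" */
    errno = 0;
    *out = strtoul(s, &end, 10);
    return (errno == 0 && *end == '\0') ? 0 : -1;
}

/* Parse a decimal address from a wide-character string. */
int ex_parse_addr_w(const wchar_t *s, unsigned long *out)
{
    wchar_t *end;

    if (s[0] < L'0' || s[0] > L'9') return -1;
    errno = 0;
    *out = wcstoul(s, &end, 10);
    return (errno == 0 && *end == L'\0') ? 0 : -1;
}
```

Because digits and English type names fall in the ASCII range, neither
conversion needs a full Unicode decoder.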

For output, the printf and fprintf functions were wrapped with
TSK-specific versions.  The wrappers convert the UTF-8 text to
UTF-16, if needed, and then print the resulting data.
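
To make the wrapper idea concrete, here is a rough sketch of an
fprintf-style wrapper that formats into a UTF-8 buffer and converts it
to UTF-16 only on Windows.  All names are hypothetical, the fixed
1024-byte buffer is an assumption made to keep the example short, and
the hand-written converter stands in for whatever conversion routine
TSK actually uses.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#ifdef _WIN32
#include <wchar.h>
#endif

/*
 * Convert NUL-terminated UTF-8 to NUL-terminated UTF-16.  Returns the
 * number of 16-bit units written (not counting the terminator), or -1
 * on malformed input or if dst is too small.
 */
long ex_utf8_to_utf16(const char *src, uint16_t *dst, size_t dst_len)
{
    static const uint32_t min_cp[] = { 0, 0x80, 0x800, 0x10000 };
    const unsigned char *p = (const unsigned char *) src;
    size_t n = 0;

    while (*p != 0) {
        uint32_t cp;
        int extra, i;

        if (*p < 0x80)                { cp = *p;        extra = 0; }
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; }
        else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; }
        else return -1;               /* invalid lead byte */
        p++;

        for (i = 0; i < extra; i++, p++) {
            /* Also stops at a NUL inside a truncated sequence. */
            if ((*p & 0xC0) != 0x80) return -1;
            cp = (cp << 6) | (*p & 0x3F);
        }
        if (cp < min_cp[extra]) return -1;          /* overlong form */
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return -1;

        if (cp < 0x10000) {
            if (n + 1 >= dst_len) return -1;
            dst[n++] = (uint16_t) cp;
        } else {                                    /* surrogate pair */
            if (n + 2 >= dst_len) return -1;
            cp -= 0x10000;
            dst[n++] = (uint16_t) (0xD800 | (cp >> 10));
            dst[n++] = (uint16_t) (0xDC00 | (cp & 0x3FF));
        }
    }
    if (n >= dst_len) return -1;
    dst[n] = 0;
    return (long) n;
}

/* fprintf-style wrapper: callers always pass UTF-8 text. */
int ex_fprintf(FILE *fp, const char *fmt, ...)
{
    char utf8[1024];
    va_list ap;
    int len;

    va_start(ap, fmt);
    len = vsnprintf(utf8, sizeof utf8, fmt, ap);
    va_end(ap);
    if (len < 0 || (size_t) len >= sizeof utf8) return -1;

#ifdef _WIN32
    {
        /* Assumes wchar_t is a 16-bit type, as it is on Windows. */
        uint16_t wide[1024];
        if (ex_utf8_to_utf16(utf8, wide, 1024) < 0) return -1;
        if (fputws((const wchar_t *) wide, fp) < 0) return -1;
    }
#else
    if (fputs(utf8, fp) < 0) return -1;
#endif
    return len;
}
```

With a wrapper of this shape, the volume and file system code can keep
producing UTF-8 strings and never needs to know which platform it is
printing on.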

Therefore, few changes were needed in the volume and file system code
other than switching to the printf wrappers.  The command line tools had
to be changed to accept the 2-byte TCHAR values as input and to use the
T* functions, which map to either UTF-8 or UTF-16 functions.


-----------------------------------------------------------------------
Copyright (c) 2006 by Brian Carrier.  All Rights Reserved
CVS Date: $Date: 2006/08/31 21:00:00 $