.. _The_Implementation_of_Standard_I/O:

**********************************
The Implementation of Standard I/O
**********************************

GNAT implements all the required input-output facilities described in
A.6 through A.14.  These sections of the Ada Reference Manual describe the
required behavior of these packages from the Ada point of view, and if
you are writing a portable Ada program that does not need to know the
exact manner in which Ada maps to the outside world when it comes to
reading or writing external files, then you do not need to read this
chapter.  As long as your files are all regular files (not pipes or
devices), and as long as you write and read the files only from Ada, the
description in the Ada Reference Manual is sufficient.

However, if you want to do input-output to pipes or other devices, such
as the keyboard or screen, or if the files you are dealing with are
either generated by some other language, or to be read by some other
language, then you need to know more about the details of how the GNAT
implementation of these input-output facilities behaves.

In this chapter we give a detailed description of exactly how GNAT
interfaces to the file system.  As always, the sources of the system are
available to you for answering questions at an even more detailed level,
but for most purposes the information in this chapter will suffice.

Another reason that you may need to know more about how input-output is
implemented arises when you have a program written in mixed languages
where, for example, files are shared between the C and Ada sections of
the same program.  GNAT provides some additional facilities, in the form
of additional child library packages, that facilitate this sharing, and
these additional facilities are also described in this chapter.

.. _Standard_I/O_Packages:

Standard I/O Packages
=====================

The Standard I/O packages described in Annex A for

*
  Ada.Text_IO
*
  Ada.Text_IO.Complex_IO
*
  Ada.Text_IO.Text_Streams
*
  Ada.Wide_Text_IO
*
  Ada.Wide_Text_IO.Complex_IO
*
  Ada.Wide_Text_IO.Text_Streams
*
  Ada.Wide_Wide_Text_IO
*
  Ada.Wide_Wide_Text_IO.Complex_IO
*
  Ada.Wide_Wide_Text_IO.Text_Streams
*
  Ada.Stream_IO
*
  Ada.Sequential_IO
*
  Ada.Direct_IO

are implemented using the C library streams facility, where:

*
  All files are opened using ``fopen``.
*
  All input/output operations use ``fread``/``fwrite``.

There is no internal buffering of any kind at the Ada library level. The only
buffering is that provided at the system level in the implementation of the
library routines that support streams. This facilitates shared use of these
streams by mixed language programs. Note though that system level buffering is
explicitly enabled at elaboration of the standard I/O packages and that can
have an impact on mixed language programs, in particular those using I/O before
calling the Ada elaboration routine (e.g., ``adainit``). It is recommended to call
the Ada elaboration routine before performing any I/O or, when that is
impractical, to flush the common I/O streams, in particular Standard_Output,
before elaborating the Ada code.

.. _FORM_Strings:

FORM Strings
============

The format of a FORM string in GNAT is:


::

  "keyword=value,keyword=value,...,keyword=value"


where letters may be in upper or lower case, and there are no spaces
between values.  The order of the entries is not important.  Currently
the following keywords are defined:


::

  TEXT_TRANSLATION=[YES|NO|TEXT|BINARY|U8TEXT|WTEXT|U16TEXT]
  SHARED=[YES|NO]
  WCEM=[n|h|u|s|e|8|b]
  ENCODING=[UTF8|8BITS]


The use of these parameters is described later in this section. If an
unrecognized keyword appears in a form string, it is silently ignored
and not considered invalid.
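
As a minimal sketch of how a FORM string is passed, the following program
creates a text file with two of the keywords listed above.  The file name
``form_demo.txt`` and the particular keyword values are illustrative
assumptions, not recommendations.

.. code-block:: ada

   with Ada.Text_IO; use Ada.Text_IO;

   procedure Form_Demo is
      F : File_Type;
   begin
      --  Keywords may be in either case and are separated by commas,
      --  with no spaces.  The file name is an illustrative assumption.
      Create (F, Out_File, "form_demo.txt", Form => "shared=no,wcem=8");
      Put_Line (F, "created with a FORM string");
      Close (F);
   end Form_Demo;

The same ``Form`` parameter is accepted by ``Open``, so an existing file
can be opened with the keywords it needs.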

.. _Direct_IO:

Direct_IO
=========

Direct_IO can only be instantiated for definite types.  This is a
restriction of the Ada language, which means that the records are fixed
length (the length being determined by ``type'Size``, rounded
up to the next storage unit boundary if necessary).

The records of a Direct_IO file are simply written to the file in index
sequence, with the first record starting at offset zero, and subsequent
records following.  There is no control information of any kind.  For
example, if 32-bit integers are being written, each record takes
4 bytes, so the record at index ``K`` starts at offset
(``K``-1)*4.

There is no limit on the size of Direct_IO files; they are expanded as
necessary to accommodate whatever records are written to the file.
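
The offset rule can be checked with a small experiment.  The sketch below
writes five 32-bit integers with Direct_IO, then reopens the same file with
Stream_IO and reads four bytes starting at byte offset (``K``-1)*4.  The file
name and the choice of ``K`` are assumptions made for illustration; note that
``Stream_IO`` indexes are 1-based, hence the ``+ 1``.

.. code-block:: ada

   with Ada.Direct_IO;
   with Ada.Streams.Stream_IO;
   with Ada.Text_IO;
   with Interfaces;

   procedure Direct_Offset_Demo is
      use type Interfaces.Integer_32;

      package Int_IO is new Ada.Direct_IO (Interfaces.Integer_32);
      package SIO renames Ada.Streams.Stream_IO;

      Name : constant String := "direct_demo.bin";  --  illustrative name
      K    : constant := 3;                         --  record to inspect
      DF   : Int_IO.File_Type;
      SF   : SIO.File_Type;
      V    : Interfaces.Integer_32;
   begin
      --  Records 1 .. 5 hold the values 100, 200, ..., 500.
      Int_IO.Create (DF, Int_IO.Out_File, Name);
      for I in Interfaces.Integer_32 range 1 .. 5 loop
         Int_IO.Write (DF, I * 100);
      end loop;
      Int_IO.Close (DF);

      --  Byte offset (K - 1) * 4 corresponds to Stream_IO index
      --  (K - 1) * 4 + 1.
      SIO.Open (SF, SIO.In_File, Name);
      SIO.Set_Index (SF, (K - 1) * 4 + 1);
      Interfaces.Integer_32'Read (SIO.Stream (SF), V);
      SIO.Close (SF);

      --  With the layout described above, V should be record K.
      Ada.Text_IO.Put_Line (Interfaces.Integer_32'Image (V));
   end Direct_Offset_Demo;

Because the file holds no control information, a C program could locate the
same record with ``fseek`` using the same offset arithmetic.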

.. _Sequential_IO:

Sequential_IO
=============

Sequential_IO may be instantiated with either a definite (constrained)
or indefinite (unconstrained) type.

For the definite type case, the elements written to the file are simply
the memory images of the data values with no control information of any
kind.  The resulting file should be read using the same type; no validity
checking is performed on input.

For the indefinite type case, the elements written consist of two
parts.  First is the size of the data item, written as the memory image
of an ``Interfaces.C.size_t`` value, followed by the memory image of
the data value.  The resulting file can only be read using the same
(unconstrained) type.  Normal assignment checks are performed on these
read operations, and if these checks fail, ``Data_Error`` is
raised.  In particular, in the array case, the lengths must match, and in
the variant record case, if the variable for a particular read operation
is constrained, the discriminants must match.

Note that it is not possible to use Sequential_IO to write variable
length array items, and then read the data back into different length
arrays.  For example, the following will raise ``Data_Error``:


.. code-block:: ada

   package IO is new Sequential_IO (String);
   F : IO.File_Type;
   S : String (1..4);
   ...
   IO.Create (F);
   IO.Write (F, "hello!");            --  one element, of length 6
   IO.Reset (F, Mode => IO.In_File);
   IO.Read (F, S);                    --  length 4 /= 6: raises Data_Error
   Put_Line (S);



On some Ada implementations, this will print ``hell``, but the program is
clearly incorrect, since there is only one element in the file, and that
element is the string ``hello!``.

In Ada 95 and Ada 2005, this kind of behavior can be legitimately achieved
using Stream_IO, and this is the preferred mechanism.  In particular, the
above program fragment rewritten to use Stream_IO will work correctly.
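
One possible Stream_IO rewrite of the fragment is sketched below.  It relies
on the fact that ``String'Write`` and ``String'Read`` transfer only the
characters, with no length or bounds information, so a shorter string can be
read back from a longer one.  The file name is an assumption made for
illustration.

.. code-block:: ada

   with Ada.Streams.Stream_IO; use Ada.Streams.Stream_IO;
   with Ada.Text_IO;

   procedure Stream_Rewrite_Demo is
      F : File_Type;
      S : String (1 .. 4);
   begin
      Create (F, Out_File, "stream_demo.bin");  --  illustrative name
      String'Write (Stream (F), "hello!");      --  six characters, no header
      Reset (F, In_File);
      String'Read (Stream (F), S);              --  reads the first four
      Close (F);
      Ada.Text_IO.Put_Line (S);
   end Stream_Rewrite_Demo;

Here the four-character read is well defined, because the file is simply a
sequence of six characters rather than a single six-character element.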

.. _Text_IO:

Text_IO
=======

Text_IO files consist of a stream of characters containing the following
special control characters:


::

  LF (line feed, 16#0A#) Line Mark
  FF (form feed, 16#0C#) Page Mark


A canonical Text_IO file is defined as one in which the following
conditions are met:

*
  The character ``LF`` is used only as a line mark, i.e., to mark the end
  of the line.

*
  The character ``FF`` is used only as a page mark, i.e., to mark the
  end of a page and consequently can appear only immediately following a
  ``LF`` (line mark) character.

*
  The file ends with either ``LF`` (line mark) or ``LF``-``FF``
  (line mark, page mark).  In the former case, the page mark is implicitly
  assumed to be present.

A file written using Text_IO will be in canonical form provided that no
explicit ``LF`` or ``FF`` characters are written using ``Put``
or ``Put_Line``.  There will be no ``FF`` character at the end of
the file unless an explicit ``New_Page`` operation was performed
before closing the file.

A canonical Text_IO file that is a regular file (i.e., not a device or a
pipe) can be read using any of the routines in Text_IO.  The
semantics in this case will be exactly as defined in the Ada Reference
Manual, and all the routines in Text_IO are fully implemented.

A text file that does not meet the requirements for a canonical Text_IO
file has one of the following:

*
  The file contains ``FF`` characters not immediately following a
  ``LF`` character.

*
  The file contains ``LF`` or ``FF`` characters written by
  ``Put`` or ``Put_Line``, which are not logically considered to be
  line marks or page marks.

*
  The file ends in a character other than ``LF`` or ``FF``,
  i.e., there is no explicit line mark or page mark at the end of the file.

Text_IO can be used to read such non-standard text files, but subprograms
that deal with line or page numbers do not have defined meanings.  In
particular, a ``FF`` character that does not follow a ``LF``
character may or may not be treated as a page mark from the point of
view of page and line numbering.  Every ``LF`` character is considered
to end a line, and there is an implied ``LF`` character at the end of
the file.
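
To see the line and page marks of a canonical file, one can write a small
file with Text_IO and then dump its bytes with Stream_IO.  The following is a
sketch under the assumption of a system where no text translation is applied
on output; the file name is illustrative.  In the output, the value 10 is
``LF`` and the value 12 is ``FF``.

.. code-block:: ada

   with Ada.Streams;           use Ada.Streams;
   with Ada.Streams.Stream_IO;
   with Ada.Text_IO;

   procedure Canonical_Demo is
      package SIO renames Ada.Streams.Stream_IO;
      package TIO renames Ada.Text_IO;

      Name : constant String := "canonical_demo.txt";  --  illustrative name
      TF   : TIO.File_Type;
      SF   : SIO.File_Type;
      Buf  : Stream_Element_Array (1 .. 64);
      Last : Stream_Element_Offset;
   begin
      --  Two one-line pages separated by an explicit page mark.
      TIO.Create (TF, TIO.Out_File, Name);
      TIO.Put_Line (TF, "ab");
      TIO.New_Page (TF);
      TIO.Put_Line (TF, "cd");
      TIO.Close (TF);

      --  Read the file back as raw bytes and print their decimal values.
      SIO.Open (SF, SIO.In_File, Name);
      SIO.Read (SF, Buf, Last);
      SIO.Close (SF);
      for I in Buf'First .. Last loop
         TIO.Put (Stream_Element'Image (Buf (I)));
      end loop;
      TIO.New_Line;
   end Canonical_Demo;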

.. _Stream_Pointer_Positioning:

Stream Pointer Positioning
--------------------------

``Ada.Text_IO`` has a definition of current position for a file that
is being read.  No internal buffering occurs in Text_IO, and usually the
physical position in the stream used to implement the file corresponds
to this logical position defined by Text_IO.  There are two exceptions:

*
  After a call to ``End_Of_Page`` that returns ``True``, the stream
  is positioned past the ``LF`` (line mark) that precedes the page
  mark.  Text_IO maintains an internal flag so that subsequent read
  operations properly handle the logical position which is unchanged by
  the ``End_Of_Page`` call.

*
  After a call to ``End_Of_File`` that returns ``True``, if the
  Text_IO file was positioned before the line mark at the end of file
  before the call, then the logical position is unchanged, but the stream
  is physically positioned right at the end of file (past the line mark,
  and past a possible page mark following the line mark).  Again Text_IO
  maintains internal flags so that subsequent read operations properly
  handle the logical position.

These discrepancies have no effect on the observable behavior of
Text_IO, but if a single Ada stream is shared between a C program and
an Ada program, or shared (using ``shared=yes`` in the form string)
between two Ada files, then the difference may be observable in some
situations.

.. _Reading_and_Writing_Non-Regular_Files:

Reading and Writing Non-Regular Files
-------------------------------------

A non-regular file is a device (such as a keyboard), or a pipe.  Text_IO
can be used for reading and writing.  Writing is not affected and the
sequence of characters output is identical to the normal file case, but
for reading, the behavior of Text_IO is modified to avoid undesirable
look-ahead as follows:

An input file that is not a regular file is considered to have no page
marks.  Any ``Ascii.FF`` characters (the character normally used for a
page mark) appearing in the file are considered to be data
characters.  In particular:

*
  ``Get_Line`` and ``Skip_Line`` do not test for a page mark
  following a line mark.  If a page mark appears, it will be treated as a
  data character.  This avoids the need to wait for an extra character to
  be typed or entered from the pipe to complete one of these operations.

*
  ``End_Of_Page`` always returns ``False``.

*
  ``End_Of_File`` will return ``False`` if there is a page mark at
  the end of the file.

Output to non-regular files is the same as for regular files.  Page marks
may be written to non-regular files using ``New_Page``, but as noted
above they will not be treated as page marks on input if the output is
piped to another Ada program.

Another important discrepancy when reading non-regular files is that the end
of file indication is not 'sticky'.  If an end of file is entered, e.g., by
pressing the :kbd:`EOT` key, then end of file is signaled once (i.e., the
test ``End_Of_File`` will yield ``True``, or a read will raise
``End_Error``), but then reading can resume to read data past that end of
file indication, until another end of file indication is entered.

.. _Get_Immediate:

Get_Immediate
-------------

.. index:: Get_Immediate

Get_Immediate returns the next character (including control characters)
from the input file.  In particular, Get_Immediate will return LF or FF
characters used as line marks or page marks.  Such operations leave the
file positioned past the control character, and it is thus not treated
as having its normal function.  This means that page, line and column
counts after this kind of Get_Immediate call are set as though the mark
did not occur.  In the case where a Get_Immediate leaves the file
positioned between the line mark and page mark (which is not normally
possible), it is undefined whether the FF character will be treated as a
page mark.
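
The following sketch illustrates the point about line marks.  It writes a
one-character line to a file (the name is an illustrative assumption) and
then calls ``Get_Immediate`` twice, printing the position of each character
returned.  According to the description above, the second call is expected
to return the ``LF`` line mark as an ordinary character rather than skipping
it.

.. code-block:: ada

   with Ada.Text_IO; use Ada.Text_IO;

   procedure Get_Immediate_Demo is
      Name : constant String := "immediate_demo.txt";  --  illustrative name
      F    : File_Type;
      C    : Character;
   begin
      Create (F, Out_File, Name);
      Put_Line (F, "a");
      Close (F);

      Open (F, In_File, Name);
      for Step in 1 .. 2 loop
         --  Returns the next character, control characters included.
         Get_Immediate (F, C);
         Put_Line (Integer'Image (Character'Pos (C)));
      end loop;
      Close (F);
   end Get_Immediate_Demo;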

.. _Treating_Text_IO_Files_as_Streams:

Treating Text_IO Files as Streams
---------------------------------

.. index:: Stream files

The package ``Ada.Text_IO.Text_Streams`` allows a ``Text_IO`` file to be treated
as a stream.  Data written to a ``Text_IO`` file in this stream mode is
binary data.  If this binary data contains bytes 16#0A# (``LF``) or
16#0C# (``FF``), the resulting file may have non-standard
format.  Similarly if read operations are used to read from a Text_IO
file treated as a stream, then ``LF`` and ``FF`` characters may be
skipped and the effect is similar to that described above for
``Get_Immediate``.
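
As a minimal sketch of stream access to a text file, the program below
obtains a stream for the standard output file from
``Ada.Text_IO.Text_Streams`` (the package listed at the start of this
chapter) and writes a string through it with a stream attribute.  The text
written is an arbitrary example.

.. code-block:: ada

   with Ada.Text_IO;
   with Ada.Text_IO.Text_Streams;

   procedure Text_Stream_Demo is
      package TS renames Ada.Text_IO.Text_Streams;

      --  Stream view of the standard output file.
      S : constant TS.Stream_Access :=
        TS.Stream (Ada.Text_IO.Standard_Output);
   begin
      --  The characters go out as raw bytes; Text_IO does not update its
      --  column count for data written this way.
      String'Write (S, "raw bytes via the stream view");
      Ada.Text_IO.New_Line;
   end Text_Stream_Demo;

Since the stream output bypasses the Text_IO line and column bookkeeping,
mixing the two styles is best kept to cases where that bookkeeping does not
matter.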

.. _Text_IO_Extensions:

Text_IO Extensions
------------------

.. index:: Text_IO extensions

A package ``GNAT.IO_Aux`` in the GNAT library provides some useful extensions
to the standard ``Text_IO`` package:

* ``function File_Exists (Name : String) return Boolean;``
  Determines if a file of the given name exists.

* ``function Get_Line return String;``
  Reads a string from the standard input file.  The value returned is exactly
  the length of the line that was read.

* ``function Get_Line (File : Ada.Text_IO.File_Type) return String;``
  Similar, except that the parameter ``File`` specifies the file from which
  the string is to be read.


.. _Text_IO_Facilities_for_Unbounded_Strings:

Text_IO Facilities for Unbounded Strings
----------------------------------------

.. index:: Text_IO for unbounded strings

.. index:: Unbounded_String, Text_IO operations

The package ``Ada.Strings.Unbounded.Text_IO``
in library files :file:`a-suteio.ads/adb` contains some GNAT-specific
subprograms useful for Text_IO operations on unbounded strings:


* ``function Get_Line (File : File_Type) return Unbounded_String;``
  Reads a line from the specified file
  and returns the result as an unbounded string.

* ``procedure Put (File : File_Type; U : Unbounded_String);``
  Writes the value of the given unbounded string to the specified file.
  Similar to the effect of
  ``Put (To_String (U))`` except that an extra copy is avoided.

* ``procedure Put_Line (File : File_Type; U : Unbounded_String);``
  Writes the value of the given unbounded string to the specified file,
  followed by a ``New_Line``.
  Similar to the effect of ``Put_Line (To_String (U))`` except
  that an extra copy is avoided.

In the above subprograms, ``File`` is of type ``Ada.Text_IO.File_Type``
and is optional.  If the parameter is omitted, then the standard input or
output file is referenced as appropriate.
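
A short round trip through these subprograms might look as follows.  This is
a sketch only: the file name and the text are illustrative assumptions, and
the package is GNAT-specific, so the program is not portable to other Ada
implementations.

.. code-block:: ada

   with Ada.Strings.Unbounded;         use Ada.Strings.Unbounded;
   with Ada.Strings.Unbounded.Text_IO;
   with Ada.Text_IO;

   procedure Unbounded_Demo is
      package UIO renames Ada.Strings.Unbounded.Text_IO;

      Name : constant String := "unbounded_demo.txt";  --  illustrative name
      F    : Ada.Text_IO.File_Type;
      Line : Unbounded_String;
   begin
      Ada.Text_IO.Create (F, Ada.Text_IO.Out_File, Name);
      UIO.Put_Line (F, To_Unbounded_String ("a line of any length"));

      --  Read the line back without declaring a fixed-size buffer.
      Ada.Text_IO.Reset (F, Ada.Text_IO.In_File);
      Line := UIO.Get_Line (F);
      Ada.Text_IO.Close (F);

      --  File parameter omitted: writes to standard output.
      UIO.Put_Line (Line);
   end Unbounded_Demo;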

The package ``Ada.Strings.Wide_Unbounded.Wide_Text_IO`` in library
files :file:`a-swuwti.ads` and :file:`a-swuwti.adb` provides similar extended
``Wide_Text_IO`` functionality for unbounded wide strings.

The package ``Ada.Strings.Wide_Wide_Unbounded.Wide_Wide_Text_IO`` in library
files :file:`a-szuzti.ads` and :file:`a-szuzti.adb` provides similar extended
``Wide_Wide_Text_IO`` functionality for unbounded wide wide strings.

.. _Wide_Text_IO:

Wide_Text_IO
============

``Wide_Text_IO`` is similar in most respects to Text_IO, except that
both input and output files may contain special sequences that represent
wide character values.  The encoding scheme for a given file may be
specified using a FORM parameter:


::

  WCEM=x


as part of the FORM string (WCEM = wide character encoding method),
where ``x`` is one of the following characters

========== ====================
Character  Encoding
========== ====================
*h*        Hex ESC encoding
*u*        Upper half encoding
*s*        Shift-JIS encoding
*e*        EUC Encoding
*8*        UTF-8 encoding
*b*        Brackets encoding
========== ====================

The encoding methods match those that can be used in a source
program, but there is no requirement that the encoding method used for
the source program be the same as the encoding method used for files,
and different files may use different encoding methods.

The default encoding method for the standard files, and for opened files
for which no WCEM parameter is given in the FORM string, matches the
wide character encoding specified for the main program (the default
being brackets encoding if no coding method was specified with ``-gnatW``).



*Hex Coding*
  In this encoding, a wide character is represented by a five character
  sequence:


::

    ESC a b c d

..

  where ``a``, ``b``, ``c``, ``d`` are the four hexadecimal
  characters (using upper case letters) of the wide character code.  For
  example, ESC A345 is used to represent the wide character with code
  16#A345#.  This scheme is compatible with use of the full
  ``Wide_Character`` set.


*Upper Half Coding*
  The wide character with encoding 16#abcd#, where the upper bit is on
  (i.e., a is in the range 8-F) is represented as two bytes 16#ab# and
  16#cd#.  The second byte may never be a format control character, but is
  not required to be in the upper half.  This method can also be used for
  shift-JIS or EUC where the internal coding matches the external coding.


*Shift JIS Coding*
  A wide character is represented by a two character sequence 16#ab# and
  16#cd#, with the restrictions described above for upper half encoding.
  The internal character code is the corresponding JIS
  character according to the standard algorithm for Shift-JIS
  conversion.  Only characters defined in the JIS code set table can be
  used with this encoding method.


*EUC Coding*
  A wide character is represented by a two character sequence 16#ab# and
  16#cd#, with both characters being in the upper half.  The internal
  character code is the corresponding JIS character according to the EUC
  encoding algorithm.  Only characters defined in the JIS code set table
  can be used with this encoding method.


*UTF-8 Coding*
  A wide character is represented using
  UCS Transformation Format 8 (UTF-8) as defined in Annex R of ISO
  10646-1/Am.2.  Depending on the character value, the representation
  is a one, two, or three byte sequence:


::

    16#0000#-16#007f#: 2#0xxxxxxx#
    16#0080#-16#07ff#: 2#110xxxxx# 2#10xxxxxx#
    16#0800#-16#ffff#: 2#1110xxxx# 2#10xxxxxx# 2#10xxxxxx#

..

  where the ``xxx`` bits correspond to the left-padded bits of the
  16-bit character value.  Note that all lower half ASCII characters
  are represented as ASCII bytes and all upper half characters and
  other wide characters are represented as sequences of upper-half
  characters.  (The full UTF-8 scheme allows for encoding 31-bit characters
  as 6-byte sequences, but in this implementation, all UTF-8 sequences
  of four or more bytes in length will raise a Constraint_Error, as
  will all invalid UTF-8 sequences.)


*Brackets Coding*
  In this encoding, a wide character is represented by the following eight
  character sequence:


::

    [ " a b c d " ]

..

  where ``a``, ``b``, ``c``, ``d`` are the four hexadecimal
  characters (using uppercase letters) of the wide character code.  For
  example, ``["A345"]`` is used to represent the wide character with code
  ``16#A345#``.
  This scheme is compatible with use of the full Wide_Character set.
  On input, brackets coding can also be used for upper half characters,
  e.g., ``["C1"]`` for the character with code ``16#C1#``.  However, on
  output, brackets notation
  is only used for wide characters with a code greater than ``16#FF#``.

  Note that brackets coding is not normally used in the context of
  Wide_Text_IO or Wide_Wide_Text_IO, since it is really just designed as
  a portable way of encoding source files. In the context of Wide_Text_IO
  or Wide_Wide_Text_IO, it can only be used if the file does not contain
  any instance of the left bracket character other than to encode wide
  character values using the brackets encoding method. In practice it is
  expected that some standard wide character encoding method such
  as UTF-8 will be used for text input output.

  If brackets notation is used, then any occurrence of a left bracket
  in the input file which is not the start of a valid wide character
  sequence will cause Constraint_Error to be raised. It is possible to
  encode a left bracket as ["5B"] and Wide_Text_IO and Wide_Wide_Text_IO
  input will interpret this as a left bracket.

  However, when a left bracket is output, it will be output as a left bracket
  and not as ["5B"]. We make this decision because for normal use of
  Wide_Text_IO for outputting messages, it is unpleasant to clobber left
  brackets. For example, if we write:


  .. code-block:: ada

       Put_Line ("Start of output [first run]");


  we really do not want to have the left bracket in this message clobbered so
  that the output reads:


::

       Start of output ["5B"]first run]

..

  In practice brackets encoding is reasonably useful for normal Put_Line use
  since we won't get confused between left brackets and wide character
  sequences in the output. But for input, or when files are written out
  and read back in, it really makes better sense to use one of the standard
  encoding methods such as UTF-8.


For the coding schemes other than UTF-8, Hex, or Brackets encoding,
not all wide character values can be represented.  An attempt to output
a character that cannot be represented using the encoding scheme for the
file causes Constraint_Error to be raised.  An invalid wide character
sequence on input also causes Constraint_Error to be raised.
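
The UTF-8 table in this section can be checked against an actual file.  The
sketch below writes the single wide character with code 16#00E9# to a file
opened with ``WCEM=8`` and then prints the bytes of that file in decimal.  By
the two-byte row of the table, that character corresponds to the bytes
16#C3# 16#A9# (195 and 169).  The file name is an illustrative assumption.

.. code-block:: ada

   with Ada.Streams;           use Ada.Streams;
   with Ada.Streams.Stream_IO;
   with Ada.Text_IO;
   with Ada.Wide_Text_IO;

   procedure WCEM_Demo is
      package SIO renames Ada.Streams.Stream_IO;
      package WIO renames Ada.Wide_Text_IO;

      Name : constant String := "wcem_demo.txt";  --  illustrative name
      WF   : WIO.File_Type;
      SF   : SIO.File_Type;
      Buf  : Stream_Element_Array (1 .. 16);
      Last : Stream_Element_Offset;
   begin
      --  Select UTF-8 for this file only, whatever the program default is.
      WIO.Create (WF, WIO.Out_File, Name, Form => "WCEM=8");
      WIO.Put (WF, Wide_Character'Val (16#00E9#));
      WIO.Close (WF);

      --  Read the file back as raw bytes and print their decimal values.
      SIO.Open (SF, SIO.In_File, Name);
      SIO.Read (SF, Buf, Last);
      SIO.Close (SF);
      for I in Buf'First .. Last loop
         Ada.Text_IO.Put (Stream_Element'Image (Buf (I)));
      end loop;
      Ada.Text_IO.New_Line;
   end WCEM_Demo;

Changing the FORM string to another method, such as ``WCEM=h``, is a quick
way to compare the external representations described above.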

.. _Stream_Pointer_Positioning_1:

Stream Pointer Positioning
--------------------------

``Ada.Wide_Text_IO`` is similar to ``Ada.Text_IO`` in its handling
of stream pointer positioning (:ref:`Text_IO`).  There is one additional
case:

If ``Ada.Wide_Text_IO.Look_Ahead`` reads a character outside the
normal lower ASCII set, i.e., a character in the range


.. code-block:: ada

  Wide_Character'Val (16#0080#) .. Wide_Character'Val (16#FFFF#)


then although the logical position of the file pointer is unchanged by
the ``Look_Ahead`` call, the stream is physically positioned past the
wide character sequence.  Again this is to avoid the need for buffering
or backup, and all ``Wide_Text_IO`` routines check the internal
indication that this situation has occurred so that this is not visible
to a normal program using ``Wide_Text_IO``.  However, this discrepancy
can be observed if the wide text file shares a stream with another file.

.. _Reading_and_Writing_Non-Regular_Files_1:

Reading and Writing Non-Regular Files
-------------------------------------

As in the case of Text_IO, when a non-regular file is read, it is
assumed that the file contains no page marks (any form feed characters are
treated as data characters), and ``End_Of_Page`` always returns
``False``.  Similarly, the end of file indication is not sticky, so
it is possible to read beyond an end of file.

.. _Wide_Wide_Text_IO:

Wide_Wide_Text_IO
=================

``Wide_Wide_Text_IO`` is similar in most respects to Text_IO, except that
both input and output files may contain special sequences that represent
wide wide character values.  The encoding scheme for a given file may be
specified using a FORM parameter:


::

  WCEM=x


as part of the FORM string (WCEM = wide character encoding method),
where ``x`` is one of the following characters

========== ====================
Character  Encoding
========== ====================
*h*        Hex ESC encoding
*u*        Upper half encoding
*s*        Shift-JIS encoding
*e*        EUC Encoding
*8*        UTF-8 encoding
*b*        Brackets encoding
========== ====================


The encoding methods match those that can be used in a source
program, but there is no requirement that the encoding method used for
the source program be the same as the encoding method used for files,
and different files may use different encoding methods.

The default encoding method for the standard files, and for opened files
for which no WCEM parameter is given in the FORM string, matches the
wide character encoding specified for the main program (the default
being brackets encoding if no coding method was specified with ``-gnatW``).



*UTF-8 Coding*
  A wide wide character is represented using
  UCS Transformation Format 8 (UTF-8) as defined in Annex R of ISO
  10646-1/Am.2.  Depending on the character value, the representation
  is a one, two, three, or four byte sequence:


::

    16#000000#-16#00007f#: 2#0xxxxxxx#
    16#000080#-16#0007ff#: 2#110xxxxx# 2#10xxxxxx#
    16#000800#-16#00ffff#: 2#1110xxxx# 2#10xxxxxx# 2#10xxxxxx#
    16#010000#-16#10ffff#: 2#11110xxx# 2#10xxxxxx# 2#10xxxxxx# 2#10xxxxxx#

..

  where the ``xxx`` bits correspond to the left-padded bits of the
  21-bit character value.  Note that all lower half ASCII characters
  are represented as ASCII bytes and all upper half characters and
  other wide characters are represented as sequences of upper-half
  characters.


*Brackets Coding*
  In this encoding, a wide wide character is represented by the following eight
  character sequence if it is in the wide character range:


::

    [ " a b c d " ]

..

  and by the following ten character sequence if not:


::

    [ " a b c d e f " ]

..

  where ``a``, ``b``, ``c``, ``d``, ``e``, and ``f``
  are the four or six hexadecimal
  characters (using uppercase letters) of the wide wide character code.  For
  example, ``["01A345"]`` is used to represent the wide wide character
  with code ``16#01A345#``.

  This scheme is compatible with use of the full Wide_Wide_Character set.
  On input, brackets coding can also be used for upper half characters,
  e.g., ``["C1"]`` for the character with code ``16#C1#``.  However, on
  output, brackets notation
  is only used for wide characters with a code greater than ``16#FF#``.


It is also possible to use the other Wide_Character encoding methods,
such as Shift-JIS, but the other schemes cannot support the full range
of wide wide characters.  An attempt to output a character that cannot
be represented using the encoding scheme for the file causes
Constraint_Error to be raised.  An invalid wide character sequence on
input also causes Constraint_Error to be raised.
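
The brackets representation can be inspected by writing a wide wide
character to a file opened with ``WCEM=b`` and reading the file back as
ordinary text.  The sketch below uses the code 16#01A345# from the example
above and an illustrative file name.  It assumes that plain ``Text_IO`` input
returns the bracket sequence unchanged as ordinary characters.

.. code-block:: ada

   with Ada.Text_IO;
   with Ada.Wide_Wide_Text_IO;

   procedure Brackets_Demo is
      package TIO renames Ada.Text_IO;
      package ZIO renames Ada.Wide_Wide_Text_IO;

      Name : constant String := "brackets_demo.txt";  --  illustrative name
      ZF   : ZIO.File_Type;
      TF   : TIO.File_Type;
   begin
      --  Write one character outside the Wide_Character range.
      ZIO.Create (ZF, ZIO.Out_File, Name, Form => "WCEM=b");
      ZIO.Put (ZF, Wide_Wide_Character'Val (16#01A345#));
      ZIO.New_Line (ZF);
      ZIO.Close (ZF);

      --  Show the external representation as plain text.
      TIO.Open (TF, TIO.In_File, Name);
      TIO.Put_Line (TIO.Get_Line (TF));
      TIO.Close (TF);
   end Brackets_Demo;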

.. _Stream_Pointer_Positioning_2:

Stream Pointer Positioning
--------------------------

``Ada.Wide_Wide_Text_IO`` is similar to ``Ada.Text_IO`` in its handling
of stream pointer positioning (:ref:`Text_IO`).  There is one additional
case:

If ``Ada.Wide_Wide_Text_IO.Look_Ahead`` reads a character outside the
normal lower ASCII set, i.e., a character in the range:


.. code-block:: ada

  Wide_Wide_Character'Val (16#0080#) .. Wide_Wide_Character'Val (16#10FFFF#)


then although the logical position of the file pointer is unchanged by
the ``Look_Ahead`` call, the stream is physically positioned past the
wide character sequence.  Again this is to avoid the need for buffering
or backup, and all ``Wide_Wide_Text_IO`` routines check the internal
indication that this situation has occurred so that this is not visible
to a normal program using ``Wide_Wide_Text_IO``.  However, this discrepancy
can be observed if the wide text file shares a stream with another file.

.. _Reading_and_Writing_Non-Regular_Files_2:

Reading and Writing Non-Regular Files
-------------------------------------

As in the case of Text_IO, when a non-regular file is read, it is
assumed that the file contains no page marks (any form feed characters are
treated as data characters), and ``End_Of_Page`` always returns
``False``.  Similarly, the end of file indication is not sticky, so
it is possible to read beyond an end of file.

.. _Stream_IO:

Stream_IO
=========

A stream file is a sequence of bytes, where individual elements are
written to the file as described in the Ada Reference Manual.  The type
``Stream_Element`` is simply a byte.  There are two ways to read or
write a stream file.

*
  The operations ``Read`` and ``Write`` directly read or write a
  sequence of stream elements with no control information.

*
  The stream attributes applied to a stream file transfer data in the
  manner described for stream attributes.
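
As a minimal sketch of the two approaches, the following program writes a
few raw stream elements with ``Write`` and then an ``Integer`` through its
stream attribute.  The file name ``demo.bin`` is an arbitrary choice made
for this illustration.

.. code-block:: ada

   with Ada.Streams;           use Ada.Streams;
   with Ada.Streams.Stream_IO; use Ada.Streams.Stream_IO;

   procedure Stream_Demo is
      F   : File_Type;
      Raw : constant Stream_Element_Array (1 .. 3) :=
              (16#41#, 16#42#, 16#43#);
   begin
      Create (F, Out_File, "demo.bin");

      --  Direct write of stream elements, no control information
      Write (F, Raw);

      --  Write through a stream attribute applied to the same file
      Integer'Write (Stream (F), 42);

      Close (F);
   end Stream_Demo;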

.. _Text_Translation:

Text Translation
================

``Text_Translation=xxx`` may be used as the Form parameter
passed to Text_IO.Create and Text_IO.Open. ``Text_Translation=xxx``
has no effect on Unix systems. Possible values are:


*
  ``Yes`` or ``Text`` is the default, which means to
  translate LF to/from CR/LF on Windows systems.

*
  ``No`` disables this translation; i.e., it
  uses binary mode. For output files, ``Text_Translation=No``
  may be used to create Unix-style files on
  Windows.

*
  ``wtext`` enables translation in Unicode mode
  (corresponds to ``_O_WTEXT``).

*
  ``u8text`` enables translation in Unicode UTF-8 mode
  (corresponds to ``_O_U8TEXT``).

*
  ``u16text`` enables translation in Unicode UTF-16
  mode (corresponds to ``_O_U16TEXT``).
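
For illustration, a form string of this kind might be passed as follows.
This is only a sketch; the file name ``unix.txt`` is an arbitrary choice,
and on Unix systems the form parameter has no effect.

.. code-block:: ada

   with Ada.Text_IO; use Ada.Text_IO;

   procedure Unix_Style is
      F : File_Type;
   begin
      --  Request binary mode so that line terminators are written as
      --  LF rather than CR/LF when running on Windows
      Create (F, Out_File, "unix.txt", Form => "Text_Translation=No");
      Put_Line (F, "a line in a Unix-style file");
      Close (F);
   end Unix_Style;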


.. _Shared_Files:

Shared Files
============

Section A.14 of the Ada Reference Manual allows implementations to
provide a wide variety of behavior if an attempt is made to access the
same external file with two or more internal files.

To provide a full range of functionality, while at the same time
minimizing the problems of portability caused by this implementation
dependence, GNAT handles file sharing as follows:

*
  In the absence of a ``shared=xxx`` form parameter, an attempt
  to open two or more files with the same full name is considered an error
  and is not supported.  The exception ``Use_Error`` will be
  raised.  Note that a file that is not explicitly closed by the program
  remains open until the program terminates.

*
  If the form parameter ``shared=no`` appears in the form string, the
  file can be opened or created with its own separate stream identifier,
  regardless of whether other files sharing the same external file are
  opened.  The exact effect depends on how the C stream routines handle
  multiple accesses to the same external files using separate streams.

*
  If the form parameter ``shared=yes`` appears in the form string for
  each of two or more files opened using the same full name, the same
  stream is shared between these files, and the semantics are as described
  in Ada Reference Manual, Section A.14.
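
The first rule above can be observed with a small test program.  This is
a sketch only; the file name ``share.txt`` is an arbitrary choice.  The
second ``Open`` names a file that is already open and gives no
``shared=xxx`` form parameter, so it is expected to raise ``Use_Error``.

.. code-block:: ada

   with Ada.Text_IO; use Ada.Text_IO;

   procedure Share_Check is
      A, B : File_Type;
   begin
      Create (A, Out_File, "share.txt");

      begin
         --  Same external file, no shared=xxx form parameter
         Open (B, In_File, "share.txt");
         Close (B);
      exception
         when Use_Error =>
            Put_Line ("second open rejected with Use_Error");
      end;

      Close (A);
   end Share_Check;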

When a program that opens multiple files with the same name is ported
from another Ada compiler to GNAT, the effect will be that
``Use_Error`` is raised.

The documentation of the original compiler and the documentation of the
program should then be examined to determine if file sharing was
expected, and ``shared=xxx`` parameters added to ``Open``
and ``Create`` calls as required.

When a program is ported from GNAT to some other Ada compiler, no
special attention is required unless the ``shared=xxx`` form
parameter is used in the program.  In this case, you must examine the
documentation of the new compiler to see if it supports the required
file sharing semantics, and modify the form strings appropriately.  Of
course it may be the case that the program cannot be ported if the
target compiler does not support the required functionality.  The best
approach in writing portable code is to avoid file sharing (and hence
the use of the ``shared=xxx`` parameter in the form string)
completely.

One common use of file sharing in Ada 83 is the use of instantiations of
Sequential_IO on the same file with different types, to achieve
heterogeneous input-output.  Although this approach will work in GNAT if
``shared=yes`` is specified, it is preferable in Ada to use Stream_IO
for this purpose (using the stream attributes).

.. _Filenames_encoding:

Filenames encoding
==================

The form parameter ``encoding=xxx`` can be used to specify the filename
encoding.

*
  If the form parameter ``encoding=utf8`` appears in the form string, the
  filename must be encoded in UTF-8.

*
  If the form parameter ``encoding=8bits`` appears in the form
  string, the filename must be a standard 8-bit string.

In the absence of an ``encoding=xxx`` form parameter, the
encoding is controlled by the ``GNAT_CODE_PAGE`` environment
variable, which accepts the values listed below.  If the variable is
not set, ``utf8`` is assumed.



*CP_ACP*
  The current system Windows ANSI code page.

*CP_UTF8*
  UTF-8 encoding

This encoding form parameter is only supported on the Windows
platform. On other operating systems, the run-time supports
UTF-8 natively.

.. _File_content_encoding:

File content encoding
=====================

For text files it is possible to specify the encoding to use. This is
controlled by the ``GNAT_CCS_ENCODING`` environment
variable. If it is not set, ``TEXT`` is assumed.

The possible values are those supported on Windows:



*TEXT*
  Translated text mode

*WTEXT*
  Translated unicode encoding

*U16TEXT*
  Unicode 16-bit encoding

*U8TEXT*
  Unicode 8-bit encoding

This encoding is only supported on the Windows platform.

.. _Open_Modes:

Open Modes
==========

``Open`` and ``Create`` calls result in a call to ``fopen``
using the mode shown in the following table:

+----------------------------+---------------+------------------+
|           ``Open`` and ``Create`` Call Modes                  |
+----------------------------+---------------+------------------+
|                            |   **OPEN**    |     **CREATE**   |
+============================+===============+==================+
| Append_File                |   "r+"        |    "w+"          |
+----------------------------+---------------+------------------+
| In_File                    |   "r"         |    "w+"          |
+----------------------------+---------------+------------------+
| Out_File (Direct_IO)       |   "r+"        |    "w"           |
+----------------------------+---------------+------------------+
| Out_File (all other cases) |   "w"         |    "w"           |
+----------------------------+---------------+------------------+
| Inout_File                 |   "r+"        |    "w+"          |
+----------------------------+---------------+------------------+


If text file translation is required, then either ``b`` or ``t``
is added to the mode, depending on the setting of Text.  Text file
translation refers to the mapping of CR/LF sequences in an external file
to LF characters internally.  This mapping only occurs in DOS and
DOS-like systems, and is not relevant to other systems.

A special case occurs with Stream_IO.  As shown in the above table, the
file is initially opened in ``r`` or ``w`` mode for the
``In_File`` and ``Out_File`` cases.  If a ``Set_Mode`` operation
subsequently requires switching from reading to writing or vice-versa,
then the file is reopened in ``r+`` mode to permit the required operation.
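
The following sketch shows a ``Set_Mode`` call of the kind described
above.  The file name ``mode.bin`` is an arbitrary choice.  Because
``Set_Mode`` to ``In_File`` is not relied upon here to reposition the
file, the index is set back to the first element explicitly before
reading.

.. code-block:: ada

   with Ada.Streams.Stream_IO; use Ada.Streams.Stream_IO;

   procedure Mode_Switch is
      F : File_Type;
      C : Character;
   begin
      --  Initially created for writing only
      Create (F, Out_File, "mode.bin");
      Character'Write (Stream (F), 'x');

      --  Switch from writing to reading on the same file object
      Set_Mode (F, In_File);
      Set_Index (F, 1);
      Character'Read (Stream (F), C);

      Close (F);
   end Mode_Switch;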

.. _Operations_on_C_Streams:

Operations on C Streams
=======================

The package ``Interfaces.C_Streams`` provides an Ada program with direct
access to the C library functions for operations on C streams:


.. code-block:: ada

  package Interfaces.C_Streams is
    -- Note: the reason we do not use the types that are in
    -- Interfaces.C is that we want to avoid dragging in the
    -- code in this unit if possible.
    subtype chars is System.Address;
    -- Pointer to null-terminated array of characters
    subtype FILEs is System.Address;
    -- Corresponds to the C type FILE*
    subtype voids is System.Address;
    -- Corresponds to the C type void*
    subtype int is Integer;
    subtype long is Long_Integer;
    -- Note: the above types are subtypes deliberately, and it
    -- is part of this spec that the above correspondences are
    -- guaranteed.  This means that it is legitimate to, for
    -- example, use Integer instead of int.  We provide these
    -- synonyms for clarity, but in some cases it may be
    -- convenient to use the underlying types (for example to
    -- avoid an unnecessary dependency of a spec on the spec
    -- of this unit).
    type size_t is mod 2 ** Standard'Address_Size;
    NULL_Stream : constant FILEs;
    -- Value returned (NULL in C) to indicate an
    -- fdopen/fopen/tmpfile error
    ----------------------------------
    -- Constants Defined in stdio.h --
    ----------------------------------
    EOF : constant int;
    -- Used by a number of routines to indicate error or
    -- end of file
    IOFBF : constant int;
    IOLBF : constant int;
    IONBF : constant int;
    -- Used to indicate buffering mode for setvbuf call
    SEEK_CUR : constant int;
    SEEK_END : constant int;
    SEEK_SET : constant int;
    -- Used to indicate origin for fseek call
    function stdin return FILEs;
    function stdout return FILEs;
    function stderr return FILEs;
    -- Streams associated with standard files
    --------------------------
    -- Standard C functions --
    --------------------------
    -- The functions selected below are ones that are
    -- available in UNIX (but not necessarily in ANSI C).
    -- These are very thin interfaces
    -- which copy exactly the C headers.  For more
    -- documentation on these functions, see the Microsoft C
    -- "Run-Time Library Reference" (Microsoft Press, 1990,
    -- ISBN 1-55615-225-6), which includes useful information
    -- on system compatibility.
    procedure clearerr (stream : FILEs);
    function fclose (stream : FILEs) return int;
    function fdopen (handle : int; mode : chars) return FILEs;
    function feof (stream : FILEs) return int;
    function ferror (stream : FILEs) return int;
    function fflush (stream : FILEs) return int;
    function fgetc (stream : FILEs) return int;
    function fgets (strng : chars; n : int; stream : FILEs)
        return chars;
    function fileno (stream : FILEs) return int;
    function fopen (filename : chars; Mode : chars)
        return FILEs;
    -- Note: to maintain target independence, use
    -- text_translation_required, a boolean variable defined in
    -- a-sysdep.c to deal with the target dependent text
    -- translation requirement.  If this variable is set,
    -- then b/t should be appended to the standard mode
    -- argument to set the text translation mode off or on
    -- as required.
    function fputc (C : int; stream : FILEs) return int;
    function fputs (Strng : chars; Stream : FILEs) return int;
    function fread
       (buffer : voids;
        size : size_t;
        count : size_t;
        stream : FILEs)
        return size_t;
    function freopen
       (filename : chars;
        mode : chars;
        stream : FILEs)
        return FILEs;
    function fseek
       (stream : FILEs;
        offset : long;
        origin : int)
        return int;
    function ftell (stream : FILEs) return long;
    function fwrite
       (buffer : voids;
        size : size_t;
        count : size_t;
        stream : FILEs)
        return size_t;
    function isatty (handle : int) return int;
    procedure mktemp (template : chars);
    -- The return value (which is just a pointer to template)
    -- is discarded
    procedure rewind (stream : FILEs);
    function rmtmp return int;
    function setvbuf
       (stream : FILEs;
        buffer : chars;
        mode : int;
        size : size_t)
        return int;

    function tmpfile return FILEs;
    function ungetc (c : int; stream : FILEs) return int;
    function unlink (filename : chars) return int;
    ---------------------
    -- Extra functions --
    ---------------------
    -- These functions supply slightly thicker bindings than
    -- those above.  They are derived from functions in the
    -- C Run-Time Library, but may do a bit more work than
    -- just directly calling one of the Library functions.
    function is_regular_file (handle : int) return int;
    -- Tests if given handle is for a regular file (result 1)
    -- or for a non-regular file (pipe or device, result 0).
    ---------------------------------
    -- Control of Text/Binary Mode --
    ---------------------------------
    -- If text_translation_required is true, then the following
    -- functions may be used to dynamically switch a file from
    -- binary to text mode or vice versa.  These functions have
    -- no effect if text_translation_required is false (i.e., in
    -- normal UNIX mode).  Use fileno to get a stream handle.
    procedure set_binary_mode (handle : int);
    procedure set_text_mode (handle : int);
    ----------------------------
    -- Full Path Name support --
    ----------------------------
    procedure full_name (nam : chars; buffer : chars);
    -- Given a NUL terminated string representing a file
    -- name, returns in buffer a NUL terminated string
    -- representing the full path name for the file name.
    -- On systems where it is relevant the drive is also
    -- part of the full path name.  It is the responsibility
    -- of the caller to pass an actual parameter for buffer
    -- that is big enough for any full path name.  Use
    -- max_path_len given below as the size of buffer.
    max_path_len : integer;
    -- Maximum length of an allowable full path name on the
    -- system, including a terminating NUL character.
  end Interfaces.C_Streams;
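
As a brief sketch of how these thin bindings might be called, the
following program writes a string to standard output with ``fputs``.
Since ``chars`` is an address, the Ada string is given an explicit NUL
terminator and passed by ``'Address``; this calling pattern is an
assumption of the example rather than something the package requires.

.. code-block:: ada

   with Interfaces.C_Streams; use Interfaces.C_Streams;

   procedure C_Put is
      --  NUL-terminated so that the C routine can find the end
      Msg    : constant String :=
                 "written by fputs" & ASCII.LF & ASCII.NUL;
      Status : int;
   begin
      Status := fputs (Msg'Address, stdout);

      if Status /= EOF then
         --  Only flush if the write did not report an error
         Status := fflush (stdout);
      end if;
   end C_Put;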


.. _Interfacing_to_C_Streams:

Interfacing to C Streams
========================

The packages in this section permit interfacing Ada files to C Stream
operations.


.. code-block:: ada

   with Interfaces.C_Streams;
   package Ada.Sequential_IO.C_Streams is
      function C_Stream (F : File_Type)
         return Interfaces.C_Streams.FILEs;
      procedure Open
        (File : in out File_Type;
         Mode : in File_Mode;
         C_Stream : in Interfaces.C_Streams.FILEs;
         Form : in String := "");
   end Ada.Sequential_IO.C_Streams;

   with Interfaces.C_Streams;
   package Ada.Direct_IO.C_Streams is
      function C_Stream (F : File_Type)
         return Interfaces.C_Streams.FILEs;
      procedure Open
        (File : in out File_Type;
         Mode : in File_Mode;
         C_Stream : in Interfaces.C_Streams.FILEs;
         Form : in String := "");
   end Ada.Direct_IO.C_Streams;

   with Interfaces.C_Streams;
   package Ada.Text_IO.C_Streams is
      function C_Stream (F : File_Type)
         return Interfaces.C_Streams.FILEs;
      procedure Open
        (File : in out File_Type;
         Mode : in File_Mode;
         C_Stream : in Interfaces.C_Streams.FILEs;
         Form : in String := "");
   end Ada.Text_IO.C_Streams;

   with Interfaces.C_Streams;
   package Ada.Wide_Text_IO.C_Streams is
      function C_Stream (F : File_Type)
         return Interfaces.C_Streams.FILEs;
      procedure Open
        (File : in out File_Type;
         Mode : in File_Mode;
         C_Stream : in Interfaces.C_Streams.FILEs;
         Form : in String := "");
   end Ada.Wide_Text_IO.C_Streams;

   with Interfaces.C_Streams;
   package Ada.Wide_Wide_Text_IO.C_Streams is
      function C_Stream (F : File_Type)
         return Interfaces.C_Streams.FILEs;
      procedure Open
        (File : in out File_Type;
         Mode : in File_Mode;
         C_Stream : in Interfaces.C_Streams.FILEs;
         Form : in String := "");
   end Ada.Wide_Wide_Text_IO.C_Streams;

   with Interfaces.C_Streams;
   package Ada.Stream_IO.C_Streams is
      function C_Stream (F : File_Type)
         return Interfaces.C_Streams.FILEs;
      procedure Open
        (File : in out File_Type;
         Mode : in File_Mode;
         C_Stream : in Interfaces.C_Streams.FILEs;
         Form : in String := "");
   end Ada.Stream_IO.C_Streams;


In each of these six packages, the ``C_Stream`` function obtains the
``FILE`` pointer from a currently opened Ada file.  It is then
possible to use the ``Interfaces.C_Streams`` package to operate on
this stream, or the stream can be passed to a C program which can
operate on it directly.  Of course the program is responsible for
ensuring that only appropriate sequences of operations are executed.

One particular use of relevance to an Ada program is that the
``setvbuf`` function can be used to control the buffering of the
stream used by an Ada file.  In the absence of such a call the standard
default buffering is used.
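
A minimal sketch of such a ``setvbuf`` call follows.  The file name
``log.txt`` and the choice of unbuffered mode are assumptions made for
the example.  The call is placed before any other operation on the file,
which is what the C library expects.

.. code-block:: ada

   with Ada.Text_IO;           use Ada.Text_IO;
   with Ada.Text_IO.C_Streams;
   with Interfaces.C_Streams;
   with System;

   procedure Unbuffered_Log is
      package ICS renames Interfaces.C_Streams;

      F      : File_Type;
      Status : ICS.int;
   begin
      Create (F, Out_File, "log.txt");

      --  Obtain the underlying C stream and turn buffering off
      Status := ICS.setvbuf
        (Ada.Text_IO.C_Streams.C_Stream (F),
         System.Null_Address,
         ICS.IONBF,
         0);

      if Status /= 0 then
         Put_Line (Standard_Error, "setvbuf failed");
      end if;

      Put_Line (F, "written without C-level buffering");
      Close (F);
   end Unbuffered_Log;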

The ``Open`` procedures in these packages open a file giving an
existing C Stream instead of a file name.  Typically this stream is
imported from a C program, allowing an Ada file to operate on an
existing C file.