File: README.md

package info (click to toggle)
python-vertica 1.4.0-2
  • links: PTS, VCS
  • area: contrib
  • in suites: forky, sid
  • size: 948 kB
  • sloc: python: 6,914; makefile: 4
file content (1220 lines) | stat: -rw-r--r-- 57,669 bytes parent folder | download | duplicates (2)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
1001
1002
1003
1004
1005
1006
1007
1008
1009
1010
1011
1012
1013
1014
1015
1016
1017
1018
1019
1020
1021
1022
1023
1024
1025
1026
1027
1028
1029
1030
1031
1032
1033
1034
1035
1036
1037
1038
1039
1040
1041
1042
1043
1044
1045
1046
1047
1048
1049
1050
1051
1052
1053
1054
1055
1056
1057
1058
1059
1060
1061
1062
1063
1064
1065
1066
1067
1068
1069
1070
1071
1072
1073
1074
1075
1076
1077
1078
1079
1080
1081
1082
1083
1084
1085
1086
1087
1088
1089
1090
1091
1092
1093
1094
1095
1096
1097
1098
1099
1100
1101
1102
1103
1104
1105
1106
1107
1108
1109
1110
1111
1112
1113
1114
1115
1116
1117
1118
1119
1120
1121
1122
1123
1124
1125
1126
1127
1128
1129
1130
1131
1132
1133
1134
1135
1136
1137
1138
1139
1140
1141
1142
1143
1144
1145
1146
1147
1148
1149
1150
1151
1152
1153
1154
1155
1156
1157
1158
1159
1160
1161
1162
1163
1164
1165
1166
1167
1168
1169
1170
1171
1172
1173
1174
1175
1176
1177
1178
1179
1180
1181
1182
1183
1184
1185
1186
1187
1188
1189
1190
1191
1192
1193
1194
1195
1196
1197
1198
1199
1200
1201
1202
1203
1204
1205
1206
1207
1208
1209
1210
1211
1212
1213
1214
1215
1216
1217
1218
1219
1220
# vertica-python

[![PyPI version](https://img.shields.io/pypi/v/vertica-python?color=brightgreen&label=PyPI%20package)](https://pypi.org/project/vertica-python/)
[![Conda Version](https://img.shields.io/conda/vn/conda-forge/vertica-python?color=yellowgreen)](https://anaconda.org/conda-forge/vertica-python)
[![License](https://img.shields.io/badge/License-Apache%202.0-orange.svg)](https://opensource.org/licenses/Apache-2.0)
[![Python Version](https://img.shields.io/pypi/pyversions/vertica-python.svg)](https://www.python.org/downloads/)
[![Downloads](https://pepy.tech/badge/vertica-python/week)](https://pepy.tech/project/vertica-python)

*vertica-python* is a native Python client for the Vertica (http://www.vertica.com) database. *vertica-python* is the replacement of the deprecated Python client *vertica_db_client*, which was removed since Vertica server version 9.3.

:loudspeaker: 08/14/2018: *vertica-python* becomes Vertica’s first officially supported open source database client, see the blog [here](https://my.vertica.com/blog/vertica-python-becomes-verticas-first-officially-supported-open-source-database-client/).

Please check out [release notes](https://github.com/vertica/vertica-python/releases) to learn about the latest improvements.

vertica-python has been tested with Vertica 24.2.0 and Python 3.7/3.8/3.9/3.10/3.11/3.12. Feel free to submit issues and/or pull requests (Read up on our [contributing guidelines](#contributing-guidelines)).


## Installation

To install vertica-python with pip:
```bash
# Latest release version
pip install vertica-python

# Latest commit on master branch
pip install git+https://github.com/vertica/vertica-python.git@master
```

To install vertica-python from source, run the following command from the root directory:

    python setup.py install

Source code for vertica-python can be found at:

    https://github.com/vertica/vertica-python

#### Using Kerberos authentication
vertica-python has optional Kerberos authentication support for Unix-like systems, which requires you to install the [kerberos](https://pypi.org/project/kerberos/) package:

    pip install kerberos

Note that `kerberos` is a python extension module, which means you need to install `python-dev`. The command depends on the package manager and will look like

    sudo [yum|apt-get|etc] install python-dev
    
Then see [this section](#kerberos-authentication) for how to config Kerberos for a connection.

## Usage
:scroll: The basic vertica-python usage is common to all the database adapters implementing the [DB-API v2.0](https://www.python.org/dev/peps/pep-0249/) protocol.

### Create a connection

The example below shows how to create a `Connection` object:

```python
import vertica_python

conn_info = {'host': '127.0.0.1',
             'port': 5433,
             'user': 'some_user',
             'password': 'some_password',
             'database': 'a_database',
             # autogenerated session label by default,
             'session_label': 'some_label',
             # default throw error on invalid UTF-8 results
             'unicode_error': 'strict',
             # SSL is disabled by default
             'ssl': False,
             # autocommit is off by default
             'autocommit': True,
             # using server-side prepared statements is disabled by default
             'use_prepared_statements': False,
             # connection timeout is not enabled by default
             # 5 seconds timeout for a socket operation (Establishing a TCP connection or read/write operation)
             'connection_timeout': 5}

# simple connection, with manual close
try:
    connection = vertica_python.connect(**conn_info)
    # do things
finally:
    connection.close()

# using `with` for auto connection closing after usage
with vertica_python.connect(**conn_info) as connection:
    # do things
```

| Connection Option  |  Description |
| ------------- | ------------- |
| host     | The server host of the connection. This can be a host name or an IP address. <br>**_Default_**: "localhost" |
| port     | The port of the connection. <br>**_Default_**: 5433 |
| user     | The database user name to use to connect to the database. <br>**_Default_**:<br>&nbsp;&nbsp;&nbsp;&nbsp;(for non-OAuth connections) OS login user name  <br>&nbsp;&nbsp;&nbsp;&nbsp;(for OAuth connections) "" |
| password | The password to use to log into the database. <br>**_Default_**: "" |
| database | The database name. <br>**_Default_**: "" |
| autocommit | See [Autocommit](#autocommit). <br>**_Default_**: False |
| backup_server_node | See [Connection Failover](#connection-failover). <br>**_Default_**: [] |
| binary_transfer | See [Data Transfer Format](#data-transfer-format). <br>**_Default_**: False (use text format transfer) |
| connection_load_balance | See [Connection Load Balancing](#connection-load-balancing). <br>**_Default_**: False (disabled) |
| connection_timeout | The number of seconds (can be a nonnegative floating point number) the client waits for a socket operation (Establishing a TCP connection or read/write operation). <br>**_Default_**: None (no timeout) |
| disable_copy_local | See [COPY FROM LOCAL](#method-2-copy-from-local-sql-with-cursorexecute). <br>**_Default_**: False |
| kerberos_host_name | See [Kerberos Authentication](#kerberos-authentication). <br>**_Default_**: the value of connection option `host` |
| kerberos_service_name | See [Kerberos Authentication](#kerberos-authentication). <br>**_Default_**: "vertica" |
| log_level | See [Logging](#logging). |
| log_path | See [Logging](#logging). |
| oauth_access_token | See [OAuth Authentication](#oauth-authentication). <br>**_Default_**: "" |
| request_complex_types | See [SQL Data conversion to Python objects](#sql-data-conversion-to-python-objects). <br>**_Default_**: True |
| session_label | Sets a label for the connection on the server. This value appears in the client_label column of the _v_monitor.sessions_ system table. <br>**_Default_**: an auto-generated label with format of `vertica-python-{version}-{random_uuid}` |
| ssl | See [TLS/SSL](#tlsssl). <br>**_Default_**: None (tlsmode="prefer") |
| tlsmode | Controls whether the connection to the server uses TLS encryption. <br>See [TLS/SSL](#tlsssl). <br>**_Default_**: "prefer" |
| tls_cafile | The name of a file containing trusted SSL certificate authority (CA) certificate(s). <br>See [TLS/SSL](#tlsssl). |
| tls_certfile | The name of a file containing client's certificate(s). <br>See [TLS/SSL](#tlsssl). |
| tls_keyfile | The name of a file containing client's private key. <br>See [TLS/SSL](#tlsssl). |
| unicode_error | See [UTF-8 encoding issues](#utf-8-encoding-issues). <br>**_Default_**: 'strict' (throw error on invalid UTF-8 results) |
| use_prepared_statements | See [Passing parameters to SQL queries](#passing-parameters-to-sql-queries). <br>**_Default_**: False |
| workload | Sets the workload name associated with this session. Valid values are workload names that already exist in a workload routing rule on the server. If a workload name that doesn't exist is entered, the server will reject it and it will be set to the default. <br>**_Default_**: "" |
| dsn | See [Set Properties with Connection String](#set-properties-with-connection-string). |


Below are a few important connection topics you may deal with, or you can skip and jump to the next section: [Send Queries and Retrieve Results](#send-queries-and-retrieve-results)

#### Set Properties with Connection String
Another way to set connection properties is passing a connection string to the keyword parameter `dsn` of `vertica_python.connect(dsn='...', **kwargs)`. The connection string is of the form:
```
vertica://(user):(password)@(host):(port)/(database)?(arg1=val1&arg2=val2&...)
```
The connection string would be parsed by `vertica_python.parse_dsn(connection_str)`, and the parsing result (a dictionary of keywords and values) would be merged with _kwargs_. If the same keyword is specified in both the sources, the _kwargs_ value overrides the parsed _dsn_ value. The `(arg1=val1&arg2=val2&...)` section can handle string/numeric/boolean values, blank and invalid value would be ignored.

```python
import vertica_python

connection_str = ('vertica://admin@localhost:5433/db1?connection_load_balance=True&connection_timeout=1.5&'
                  'session_label=vpclient+123%7E456')
print(vertica_python.parse_dsn(connection_str))
# {'user': 'admin', 'host': 'localhost', 'port': 5433, 'database': 'db1',
#  'connection_load_balance': True, 'connection_timeout': 1.5, 'session_label': 'vpclient 123~456'}

additional_info = {
    'password': 'some_password', 
    'backup_server_node': ['10.6.7.123', ('10.20.82.77', 6000)]  # invalid value to be set in a connection string
    }

with vertica_python.connect(dsn=connection_str, **additional_info) as conn:
   # do things
```

#### TLS/SSL

There are two options to control client-server TLS: `tlsmode` and `ssl`. If both are set, `tlsmode` takes precedence.

`ssl` can be a bool or a `ssl.SSLContext` object. Here is the value mapping between `ssl` (exclude `ssl.SSLContext`) and `tlsmode`:

| `tlsmode` |  `ssl` | Description |
| ------------- | ------------- | ---|
| 'disable'     | False | only try a non-TLS connection. |
| 'prefer'      | (not set) | (Default) first try a TLS connection; if TLS is disabled on the server, then fallback to a non-TLS connection. <br>Note: If TLS is enabled on the server and TLS connection fails, the client rejects the connection. |
| 'require'     | True |  connects using TLS without verifying certificates. If the TLS connection attempt fails, the client rejects the connection. |
| 'verify-ca'   || connects using TLS and confirms that the server certificate has been signed by a trusted certificate authority. |
| 'verify-full' || connects using TLS, confirms that the server certificate has been signed by a trusted certificate authority, and verifies that the host name matches the name provided in the server certificate. |

When `tlsmode` is 'verify-ca' or 'verify-full', these options take certificate/key files: `tls_cafile`, `tls_certfile` and `tls_keyfile`. Otherwise, these options are ignored.

`tlsmode` example:
```python
# [TLSMode: require]
import vertica_python
conn_info = {'host': '127.0.0.1',
             'user': 'some_user',
             'database': 'a_database',
             'tlsmode': 'require'}
connection = vertica_python.connect(**conn_info)
```
```python
# [TLSMode: verify-ca]
import vertica_python
conn_info = {'host': '127.0.0.1',
             'user': 'some_user',
             'database': 'a_database',
             'tlsmode': 'verify-ca',
             'tls_cafile': '/path/to/ca_file.pem' # CA certificate used to verify server certificate
            }
connection = vertica_python.connect(**conn_info)
```

```python
# [TLSMode: verify-full] + Mutual Mode
import vertica_python

conn_info = {'host': '127.0.0.1',
             'user': 'some_user',
             'database': 'a_database',
             'tlsmode': 'verify-full',
             'tls_cafile' = '/path/to/ca_file.pem'  # CA certificate used to verify server certificate
             'tls_certfile' = '/path/to/client.pem',  # (for mutual mode) client certificate
             'tls_keyfile' = '/path/to/client.key'  # (for mutual mode) client private key
            }
connection = vertica_python.connect(**conn_info)
```

You can pass an `ssl.SSLContext` object to `ssl` to customize the underlying SSL connection options. See more on SSL options [here](https://docs.python.org/3/library/ssl.html).

Server mode TLS examples:

```python
import vertica_python
import ssl

# [TLSMode: require]
# Ensure connection is encrypted.
ssl_context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
ssl_context.check_hostname = False
ssl_context.verify_mode = ssl.CERT_NONE

conn_info = {'host': '127.0.0.1',
             'port': 5433,
             'user': 'some_user',
             'database': 'a_database',
             'ssl': ssl_context}
connection = vertica_python.connect(**conn_info)


# [TLSMode: verify-ca]
# Ensure connection is encrypted, and client trusts server certificate.
ssl_context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
ssl_context.verify_mode = ssl.CERT_REQUIRED
ssl_context.check_hostname = False
ssl_context.load_verify_locations(cafile='/path/to/ca_file.pem') # CA certificate used to verify server certificate

conn_info = {'host': '127.0.0.1',
             'port': 5433,
             'user': 'some_user',
             'database': 'a_database',
             'ssl': ssl_context}
connection = vertica_python.connect(**conn_info)


# [TLSMode: verify-full]
# Ensure connection is encrypted, client trusts server certificate,
# and server hostname matches the one listed in the server certificate.
ssl_context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
ssl_context.verify_mode = ssl.CERT_REQUIRED
ssl_context.check_hostname = True
ssl_context.load_verify_locations(cafile='/path/to/ca_file.pem') # CA certificate used to verify server certificate

conn_info = {'host': '127.0.0.1',
             'port': 5433,
             'user': 'some_user',
             'database': 'a_database',
             'ssl': ssl_context}
connection = vertica_python.connect(**conn_info)
```

Mutual mode TLS example:
```python
import vertica_python
import ssl

# [TLSMode: verify-full]
# Ensure connection is encrypted, client trusts server certificate,
# and server hostname matches the one listed in the server certificate.
ssl_context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
ssl_context.verify_mode = ssl.CERT_REQUIRED
ssl_context.check_hostname = True
ssl_context.load_verify_locations(cafile='/path/to/ca_file.pem') # CA certificate used to verify server certificate

# For Mutual mode, provide client certificate and client private key to ssl_context.
# CA certificate used to verify client certificate should be set at the server side.
ssl_context.load_cert_chain(certfile='/path/to/client.pem', keyfile='/path/to/client.key')

conn_info = {'host': '127.0.0.1',
             'port': 5433,
             'user': 'some_user',
             'database': 'a_database',
             'ssl': ssl_context}
connection = vertica_python.connect(**conn_info)
```



#### Kerberos Authentication
In order to use Kerberos authentication, install [dependencies](#using-kerberos-authentication) first, and it is the user's responsibility to ensure that an Ticket-Granting Ticket (TGT) is available and valid. Whether a TGT is available can be easily determined by running the `klist` command. If no TGT is available, then it first must be obtained by running the `kinit` command or by logging in. You can pass in optional arguments to customize the authentication. The arguments are `kerberos_service_name`, which defaults to "vertica", and `kerberos_host_name`, which defaults to the value of argument `host`. For example,

```python
import vertica_python

conn_info = {'host': '127.0.0.1',
             'port': 5433,
             'user': 'some_user',
             'password': 'some_password',
             'database': 'a_database',
             # The service name portion of the Vertica Kerberos principal
             'kerberos_service_name': 'vertica_krb',
             # The instance or host name portion of the Vertica Kerberos principal
             'kerberos_host_name': 'vcluster.example.com'}

with vertica_python.connect(**conn_info) as conn:
    # do things
```

#### OAuth Authentication
To authenticate via OAuth, provide an `oauth_access_token` that authorizes a user to the database.
```python
import vertica_python

conn_info = {'host': '127.0.0.1',
             'port': 5433,
             'database': 'a_database',
             # valid OAuth access token
             'oauth_access_token': 'xxxxxx'}

with vertica_python.connect(**conn_info) as conn:
    # do things
```

#### Logging
Logging is disabled by default if neither ```log_level``` or ```log_path``` are set. Passing value to at least one of those options to enable logging.

When logging is enabled, the default value of ```log_level``` is _logging.WARNING_. You can find all levels [here](https://docs.python.org/3/library/logging.html#logging-levels). And the default value of ```log_path``` is 'vertica_python.log', the log file will be in the current execution directory. If ```log_path``` is set to ```''``` (empty string) or ```None```, no file handler is set, logs will be processed by root handlers. For example,

```python
import vertica_python
import logging

## Example 1: write DEBUG level logs to './vertica_python.log'
conn_info = {'host': '127.0.0.1',
             'port': 5433,
             'user': 'some_user',
             'password': 'some_password',
             'database': 'a_database',
             'log_level': logging.DEBUG}
with vertica_python.connect(**conn_info) as connection:
    # do things

## Example 2: write WARNING level logs to './path/to/logs/client.log'
conn_info = {'host': '127.0.0.1',
             'port': 5433,
             'user': 'some_user',
             'password': 'some_password',
             'database': 'a_database',
             'log_path': 'path/to/logs/client.log'}
with vertica_python.connect(**conn_info) as connection:
   # do things

## Example 3: write INFO level logs to '/home/admin/logs/vClient.log'
conn_info = {'host': '127.0.0.1',
             'port': 5433,
             'user': 'some_user',
             'password': 'some_password',
             'database': 'a_database',
             'log_level': logging.INFO,
             'log_path': '/home/admin/logs/vClient.log'}
with vertica_python.connect(**conn_info) as connection:
   # do things

## Example 4: use root handlers to process logs by setting 'log_path' to '' (empty string) 
conn_info = {'host': '127.0.0.1',
             'port': 5433,
             'user': 'some_user',
             'password': 'some_password',
             'database': 'a_database',
             'log_level': logging.DEBUG,
             'log_path': ''}
with vertica_python.connect(**conn_info) as connection:
    # do things
```

#### Connection Failover
Supply a list of backup hosts to ```backup_server_node``` for the client to try if the primary host you specify in the connection parameters (```host```, ```port```) is unreachable. Each item in the list should be either a host string (using default port 5433) or a (host, port) tuple. A host can be a host name or an IP address.

```python
import vertica_python

conn_info = {'host': 'unreachable.server.com',
             'port': 888,
             'user': 'some_user',
             'password': 'some_password',
             'database': 'a_database',
             'backup_server_node': ['123.456.789.123', 'invalid.com', ('10.20.82.77', 6000)]}
connection = vertica_python.connect(**conn_info)
```

#### Connection Load Balancing
Connection Load Balancing helps automatically spread the overhead caused by client connections across the cluster by having hosts redirect client connections to other hosts. Both the server and the client need to enable load balancing for it to function. If the server disables connection load balancing, the load balancing request from client will be ignored.

```python
import vertica_python

conn_info = {'host': '127.0.0.1',
             'port': 5433,
             'user': 'some_user',
             'password': 'some_password',
             'database': 'vdb',
             'connection_load_balance': True}

# Server enables load balancing
with vertica_python.connect(**conn_info) as conn:
    cur = conn.cursor()
    cur.execute("SELECT NODE_NAME FROM V_MONITOR.CURRENT_SESSION")
    print("Client connects to primary node:", cur.fetchone()[0])
    cur.execute("SELECT SET_LOAD_BALANCE_POLICY('ROUNDROBIN')")

with vertica_python.connect(**conn_info) as conn:
    cur = conn.cursor()
    cur.execute("SELECT NODE_NAME FROM V_MONITOR.CURRENT_SESSION")
    print("Client redirects to node:", cur.fetchone()[0])

## Output
#  Client connects to primary node: v_vdb_node0003
#  Client redirects to node: v_vdb_node0005
```

#### Data Transfer Format
There are two formats for transferring data from a server to a vertica-python client: text and binary. For example, a FLOAT type data is represented as a 8-byte IEEE-754 floating point number (fixed-width) in binary format, and a human-readable string (variable-width) in text format. The text format of values is whatever strings are produced and accepted by the input/output conversion functions for the particular data type.

Depending on the data type, binary transfer is generally more efficient and requires less bandwidth than text transfer. However, when transferring a large number of small values, binary transfer may use more bandwidth.

A connection is set to use text format by default. Set ```binary_transfer``` to True to use binary format.

```python
import vertica_python

conn_info = {'host': '127.0.0.1',
             'port': 5433,
             'user': 'some_user',
             'password': 'some_password',
             'database': 'vdb',
             'binary_transfer': True  # False by default
             }

# Server enables binary transfer
with vertica_python.connect(**conn_info) as conn:
    cur = conn.cursor()
    ...
```

Ideally, the [output data](#sql-data-conversion-to-python-objects) should be the same for these two formats. However, there are edge cases:
- FLOAT data: binary format might offer slightly greater precision than text format. E.g. `select ATAN(12.345)` returns 1.48996835348642 (text) or 1.489968353486419 (binary)
- TIMESTAMPTZ data: text format always use the session timezone, but binary format might fail to get session timezone and use local timezone.
- NUMERIC data: In old server versions, the precision and scale is incorrect when querying a NUMERIC column that is not from a specific table with prepared statement in binary format. E.g. `select ?::NUMERIC` or `select node_id, ?/50 from nodes`. In newer server versions, binary transfer is forcibly disabled for NUMERIC data by the server, regardless of client-side values of ```binary_transfer``` and ```use_prepared_statements```.



### Send Queries and Retrieve Results
The `Connection` class encapsulates a database session. It allows to:
- create new `Cursor` instances using the `cursor()` method to execute database commands and queries.
- [terminate transactions](#insert-and-commitrollback) using the methods `commit()` or `rollback()`.

The class `Cursor` allows interaction with the database:
- send commands to the database using methods such as `execute()`, `executemany()` and [copy](#using-copy-from).
- retrieve data from the database, [iterating on the cursor](#stream-query-results) or using methods such as `fetchone()`, `fetchmany()`, `fetchall()`, `nextset()`.

```python
import vertica_python

conn_info = {'host': '127.0.0.1',
             'port': 5433,
             'user': 'some_user',
             'password': 'some_password',
             'database': 'vdb'}

# Connect to a vertica database
with vertica_python.connect(**conn_info) as conn:
    # Open a cursor to perform database operations
    # vertica-python only support one cursor per connection
    cur = conn.cursor()
    
    # Execute a command: create a table
    cur.execute("CREATE TABLE tbl (a INT, b VARCHAR)")
    
    # Insert a row
    cur.execute("INSERT INTO tbl VALUES (1, 'aa')")
    inserted = cur.fetchall()  # [[1]]
    
    # Bulk Insert with executemany()
    # Pass data to fill a query placeholders and let vertica-python perform the correct conversion
    cur.executemany("INSERT INTO tbl(a, b) VALUES (?, ?)", [(2, 'bb'), (3, 'foo'), (4, 'xx'), (5, 'bar')], use_prepared_statements=True)
    # OR
    # cur.executemany("INSERT INTO tbl(a, b) VALUES (%s, %s)", [(6, 'bb'), (7, 'foo'), (8, 'xx'), (9, 'bar')], use_prepared_statements=False)
    
    # Query the database and obtain data as Python objects.
    cur.execute("SELECT * FROM tbl")
    datarow = cur.fetchone()         # [1, 'aa']
    remaining_rows = cur.fetchall()  # [[2, 'bb'], [3, 'foo'], [4, 'xx'], [5, 'bar']]

    # Make the changes to the database persistent
    conn.commit()
    
    # Execute a query with MULTIPLE statements
    cur.execute("SELECT 1; SELECT 2; ...; SELECT N")
    while True:  # Fetch the result set for each statement
        rows = cur.fetchall()
        print(rows)
        if not cur.nextset():
            break
    # Output:        
    # [[1]]
    # [[2]]
    # ...
    # [[N]]

```
#### Frequently Asked Questions :speech_balloon:
<details>
<summary>Why does my query return empty results?</summary>
  If you think <code>Cursor.fetch*()</code> should return something, check whether your query contains multiple statements. It is very likely that you miss to call <code>Cursor.nextset()</code>.
</details>

<details>
<summary>Why does my query not throw an error?</summary>
  vertica-python tries to throw exceptions in the <code>Cursor.execute()</code> method, but depending on your query, there are some exceptions that can only be raised when you call <code>fetchone()</code> <code>fetchmany()</code> or <code>fetchall()</code>. In addition, if your query has multiple statements, errors that is not in the first statement cannot be thrown by <code>execute()</code>. It is recommended to always call <code>fetchall()</code> after <code>execute()</code> in order to capture any error. And for a query with multiple statements, call <code>fetchall()</code> and <code>nextset()</code> as the above example code shows.
</details>

<details>
<summary>Why is this client N times slower than another vertica client?</summary>
  You may find vertica-python performs much slower executing same query on same machine than another python client (e.g. pyodbc) or client in other programming language. This is because vertica-python is a pure Python program and CPython (the official implementation of Python, which is an interpreted, dynamic language) computation is often many times slower than compiled languages like C and Go, or JIT (Just-in-Time) compiled languages like Java and JavaScript. Therefore, if you want to get better performance, instead of using the official CPython interpreter, try other performance-oriented interpreters such as PyPy.
</details>

### Stream query results
Streaming is recommended if you want to further process each row, save the results in a non-list/dict format (e.g. Pandas DataFrame), or save the results in a file.

```python
cur = connection.cursor()
cur.execute("SELECT * FROM a_table LIMIT 2")

for row in cur.iterate():
    print(row)
# [ 1, 'some text', datetime.datetime(2014, 5, 18, 6, 47, 1, 928014) ]
# [ 2, 'something else', None ]

```


### In-memory results as list

```python
cur = connection.cursor()
cur.execute("SELECT * FROM a_table LIMIT 2")
cur.fetchall()
# [ [1, 'something'], [2, 'something_else'] ]
```


### In-memory results as dictionary

```python
cur = connection.cursor('dict')
cur.execute("SELECT * FROM a_table LIMIT 2")
cur.fetchall()
# [ {'id': 1, 'value': 'something'}, {'id': 2, 'value': 'something_else'} ]
connection.close()
```

### Nextset

If you execute multiple statements in a single call to execute(), you can use `Cursor.nextset()` to retrieve all of the data.

```python
cur.execute('SELECT 1; SELECT 2;')

cur.fetchone()
# [1]
cur.fetchone()
# None

cur.nextset()
# True

cur.fetchone()
# [2]
cur.fetchone()
# None

cur.nextset()
# False
```

### Passing parameters to SQL queries

vertica-python provides two methods for passing parameters to a SQL query:
1. [Server-side binding](#server-side-binding-query-using-prepared-statements)
2. [Client-side binding](#client-side-binding-query-using-named-parameters-or-format-parameters)

:warning: Prerequisites: Only SQL literals (i.e. query values) should be bound via these methods: they shouldn’t be used to merge table or field names to the query (_vertica-python_ will try quoting the table name as a string value, generating invalid SQL as it is actually a SQL Identifier). If you need to generate dynamically SQL queries (for instance choosing dynamically a table name) you have to construct the full query yourself.

#### Server-side binding: Query using prepared statements

Vertica server-side prepared statements let you define a statement once and then run it many times with different parameters. Internally, vertica-python sends the query and the parameters to the server separately. Placeholders in the statement are represented by question marks (?). Server-side prepared statements are useful for preventing SQL injection attacks.

```python
import vertica_python

# Enable using server-side prepared statements at connection level
conn_info = {'host': '127.0.0.1',
             'user': 'some_user',
             'password': 'some_password',
             'database': 'a_database',
             'use_prepared_statements': True,
             }

with vertica_python.connect(**conn_info) as connection:
    cur = connection.cursor()
    cur.execute("CREATE TABLE tbl (a INT, b VARCHAR)")
    cur.execute("INSERT INTO tbl VALUES (?, ?)", [1, 'aa'])
    cur.execute("INSERT INTO tbl VALUES (?, ?)", [2, 'bb'])
    cur.executemany("INSERT INTO tbl VALUES (?, ?)", [(3, 'foo'), (4, 'xx'), (5, 'bar')])
    cur.execute("COMMIT")

    cur.execute("SELECT * FROM tbl WHERE a>=? AND a<=? ORDER BY a", (2,4))
    cur.fetchall()
    # [[2, 'bb'], [3, 'foo'], [4, 'xx']]
```

:no_entry_sign: Vertica server-side prepared statements does not support executing a query string containing multiple statements.

You can set ```use_prepared_statements``` option in ```cursor.execute*()``` functions to override the connection level setting.

```python
import vertica_python

# Enable using server-side prepared statements at connection level
conn_info = {'host': '127.0.0.1',
             'user': 'some_user',
             'password': 'some_password',
             'database': 'a_database',
             'use_prepared_statements': True,
             }

with vertica_python.connect(**conn_info) as connection:
    cur = connection.cursor()
    cur.execute("CREATE TABLE tbl (a INT, b VARCHAR)")

    # Executing compound statements
    cur.execute("INSERT INTO tbl VALUES (?, ?); COMMIT", [1, 'aa'])
    # Error message: Cannot insert multiple commands into a prepared statement

    # Disable prepared statements but forget to change placeholders (?)
    cur.execute("INSERT INTO tbl VALUES (?, ?); COMMIT;", [1, 'aa'], use_prepared_statements=False)
    # TypeError: not all arguments converted during string formatting

    cur.execute("INSERT INTO tbl VALUES (%s, %s); COMMIT;", [1, 'aa'], use_prepared_statements=False)
    cur.execute("INSERT INTO tbl VALUES (:a, :b); COMMIT;", {'a': 2, 'b': 'bb'}, use_prepared_statements=False)

# Disable using server-side prepared statements at connection level
conn_info['use_prepared_statements'] = False
with vertica_python.connect(**conn_info) as connection:
    cur = connection.cursor()

    # Try using prepared statements
    cur.execute("INSERT INTO tbl VALUES (?, ?)", [3, 'foo'])
    # TypeError: not all arguments converted during string formatting

    cur.execute("INSERT INTO tbl VALUES (?, ?)", [3, 'foo'], use_prepared_statements=True)

    # Query using named parameters
    cur.execute("SELECT * FROM tbl WHERE a>=:n1 AND a<=:n2 ORDER BY a", {'n1': 2, 'n2': 4})
    cur.fetchall()
    # [[2, 'bb'], [3, 'foo']]
```
Note: In other drivers, the batch insert is converted into a COPY statement by using prepared statements. vertica-python currently does not support that. [More details](#cursorexecutemany-server-side-binding-vs-client-side-binding)

#### Client-side binding: Query using named parameters or format parameters

vertica-python can automatically convert Python objects to SQL literals, merge the query and the parameters on the client side, and then send the query to the server: using this feature your code will be more robust and reliable to prevent SQL injection attacks. You need to set ```use_prepared_statements``` option to __False__ (at connection level or in cursor.execute*()) to use client-side binding.

Variables can be specified with named (__:name__) placeholders.
```python
cur = connection.cursor()
data = {'propA': 1, 'propB': 'stringValue'}
cur.execute("SELECT * FROM a_table WHERE a = :propA AND b = :propB", data, use_prepared_statements=False)
# converted into a SQL command similar to: SELECT * FROM a_table WHERE a = 1 AND b = 'stringValue'

cur.fetchall()
# [ [1, 'stringValue'] ]
```

Variables can also be specified with positional format (__%s__) placeholders. The placeholder __must always be a %s__, even if a different placeholder (such as a `%d` for integers or `%f` for floats) may look more appropriate. __Never__ use Python string concatenation (+) or string parameters interpolation (%) to pass variables to a SQL query string.
```python
cur = connection.cursor()
data = (1, "O'Reilly")
cur.execute("SELECT * FROM a_table WHERE a = %s AND b = %s" % data) # WRONG: % operator
cur.execute("SELECT * FROM a_table WHERE a = %d AND b = %s", data)  # WRONG: %d placeholder
cur.execute("SELECT * FROM a_table WHERE a = %s AND b = %s", data)  # correct
# converted into a SQL command: SELECT * FROM a_table WHERE a = 1 AND b = 'O''Reilly'

cur.fetchall()
# [ [1, "O'Reilly"] ]
```

The placeholder must not be quoted. _vertica-python_ will add quotes where needed.
```python
>>> cur.execute("INSERT INTO table VALUES (':propA')", {'propA': "someString"}) # WRONG
>>> cur.execute("INSERT INTO table VALUES (:propA)", {'propA': "someString"})   # correct
>>> cur.execute("INSERT INTO table VALUES ('%s')", ("someString",)) # WRONG
>>> cur.execute("INSERT INTO table VALUES (%s)", ("someString",))   # correct
```

In order to merge the query (with placeholders) and the parameters on the client side, parameter values (python object) are converted to SQL literals (str). _vertica-python_ supports default mapping to SQL literals for many standard Python types (str, bytes, bool, int, float, decimal.Decimal, tuple, list, set, dict, datetime.datetime, datetime.date, datetime.time, uuid.UUID). For complex types, in some cases, you may need explicit typecasting for the placeholder (e.g. `%s::ARRAY[ARRAY[INT]]`, `%s::ROW(varchar,int,date)`):

```python
from datetime import date
cur.execute("CREATE TABLE table (a INT, b ARRAY[DATE])")
value = [date(2021, 6, 10), date(2021, 6, 12), date(2021, 6, 30)]

cur.execute("INSERT INTO table VALUES (%s, %s)", [100, value], use_prepared_statements=False)  # WRONG
# Error Message: Column "b" is of type array[date] but expression is of type array[varchar], Sqlstate: 42804, 
# Hint: You will need to rewrite or cast the expression

cur.execute("INSERT INTO table VALUES (%s, %s::ARRAY[DATE])", [100, value], use_prepared_statements=False)  # correct
# converted into a SQL command: INSERT INTO vptest VALUES (100, ARRAY['2021-06-10','2021-06-12','2021-06-30']::ARRAY[DATE])

# Client-side binding of cursor.executemany is different from cursor.execute internally
# But it also supports some of complex types mapping
cur.executemany("INSERT INTO table (a, b) VALUES (%s, %s)", [[100, value]], use_prepared_statements=False)
```

##### Register new SQL literal adapters
It is possible to adapt new Python types to SQL literals via `Cursor.register_sql_literal_adapter(py_class_or_type, adapter_function)` function. Example:
```python
class Point(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

# Adapter should return a string value
def adapt_point(point):
    return "STV_GeometryPoint({},{})".format(point.x, point.y)

cur = conn.cursor()
cur.register_sql_literal_adapter(Point, adapt_point)

cur.execute("INSERT INTO geom_data (geom) VALUES (%s)", [Point(1.23, 4.56)])
cur.execute("select ST_asText(geom) from geom_data")
cur.fetchall()
# [['POINT (1.23 4.56)']]
```

To help you debug the binding process during Cursor.execute*(), `Cursor.object_to_sql_literal(py_object)` function can be used to inspect the SQL literal string converted from a Python object.
```python
cur = conn.cursor
cur.object_to_sql_literal("O'Reilly")  # "'O''Reilly'"
cur.object_to_sql_literal(None)  # "NULL"
cur.object_to_sql_literal(True)  # "True"
cur.object_to_sql_literal(Decimal("10.00000"))  # "10.00000"
cur.object_to_sql_literal(datetime.date(2018, 9, 7))  # "'2018-09-07'"
cur.object_to_sql_literal(Point(-71.13, 42.36))  # "STV_GeometryPoint(-71.13,42.36)" if you registered in previous step
```

#### Cursor.executemany(): Server-side binding vs Client-side binding
```
Cursor.executemany(query, seq_of_parameters, use_prepared_statements=None)
```
Execute the same query or command with a sequence of input data.

PARAMETERS

 - query (str or bytes) – The query to execute.
 - seq_of_parameters (a list/tuple of Sequences or Mappings) – The parameters to pass to the query.
 - use_prepared_statements (bool) – Use connection level setting by default. If set, execute the query using server-side prepared statements or not.

When `use_prepared_statements=True` (Server-side binding), the query should contain only a single statement. Internally, vertica-python sends the query and each set of parameters to the server separately.

When `use_prepared_statements=False` (Client-side binding), the query is limited to simple INSERT statements only. The batch insert is converted into a COPY FROM  STDIN statement by the client. This is more efficient than performing separate queries (may even faster than Server-side binding), but in case of other statements you may consider using [copy](#using-copy-from).

```python
# Note the query parameter placeholders difference!
cur.executemany("INSERT INTO tbl(a, b) VALUES (?, ?)", [(2, 'bb'), (3, 'foo'), (4, 'xx'), (5, 'bar')], use_prepared_statements=True)

cur.executemany("INSERT INTO tbl(a, b) VALUES (%s, %s)", [(6, 'bb'), (7, 'foo'), (8, 'xx'), (9, 'bar')], use_prepared_statements=False)
cur.executemany("INSERT INTO tbl(a, b) VALUES (:a, :b)", [{'a': 2, 'b': 'bb'}, {'a': 3, 'b': 'foo'}], use_prepared_statements=False)
```


### Insert and commit/rollback

```python
cur = connection.cursor()

# inline commit (when 'use_prepared_statements' is False)
cur.execute("INSERT INTO a_table (a, b) VALUES (1, 'aa'); commit;")

# commit in execution
cur.execute("INSERT INTO a_table (a, b) VALUES (1, 'aa')")
cur.execute("INSERT INTO a_table (a, b) VALUES (2, 'bb')")
cur.execute("commit;")

# connection.commit()
cur.execute("INSERT INTO a_table (a, b) VALUES (1, 'aa')")
connection.commit()

# connection.rollback()
cur.execute("INSERT INTO a_table (a, b) VALUES (0, 'bad')")
connection.rollback()
```

### Autocommit

Session parameter AUTOCOMMIT can be configured by the connection option and the `Connection.autocommit` read/write attribute:
```python
import vertica_python

# Enable autocommit at startup
conn_info = {'host': '127.0.0.1',
             'user': 'some_user',
             'password': 'some_password',
             'database': 'a_database',
             # autocommit is off by default
             'autocommit': True,
             }
             
with vertica_python.connect(**conn_info) as connection:
    # Check current session autocommit setting
    print(connection.autocommit)    # should be True
    # If autocommit is True, statements automatically commit their transactions when they complete.
    
    # Set autocommit setting with attribute
    connection.autocommit = False
    print(connection.autocommit)    # should be False
    # If autocommit is False, the methods commit() or rollback() must be manually invoked to terminate the transaction.
```

To set AUTOCOMMIT to a new value, vertica-python uses `Cursor.execute()` to execute a command internally, and that would clear your previous query results, so be sure to call `Cursor.fetch*()` to save your results before you set autocommit.


### Using COPY FROM

:warning: Prerequisites: If the data is in files not STDIN, only files on the client system should be loaded via these methods below. For files on the server system, run "COPY target-table FROM 'path‑to‑data'" with `Cursor.execute()`.

There are 2 methods to do copy:

#### Method 1: "COPY FROM STDIN" sql with Cursor.copy()
```python
cur = connection.cursor()
cur.copy("COPY test_copy (id, name) FROM stdin DELIMITER ',' ",  csv)
```

Where `csv` is either a string or a file-like object (specifically, any object with a `read()` method). If using a file, the data is streamed (in chunks of `buffer_size` bytes, which defaults to 128 * 2 ** 10).

```python
with open("/tmp/binary_file.csv", "rb") as fs:
    cursor.copy("COPY table(field1, field2) FROM STDIN DELIMITER ',' ENCLOSED BY '\"'",
                fs, buffer_size=65536)
```

#### Method 2: "COPY FROM LOCAL" sql with Cursor.execute() 

```python
import sys
import vertica_python

conn_info = {'host': '127.0.0.1',
             'user': 'some_user',
             'password': 'some_password',
             'database': 'a_database',
             # False by default
             #'disable_copy_local': True,
             # Don't support executing COPY LOCAL operations with prepared statements
             'use_prepared_statements': False
             }

with vertica_python.connect(**conn_info) as connection:
    cur = connection.cursor()
    
    # Copy from local file
    cur.execute("COPY table(field1, field2) FROM LOCAL"
                " 'data_Jan_*.csv','data_Feb_01.csv' DELIMITER ','"
                " REJECTED DATA 'path/to/write/rejects.txt'"
                " EXCEPTIONS 'path/to/write/exceptions.txt'",
                buffer_size=65536
    )
    print("Rows loaded:", cur.fetchall())
    
    # Copy from local stdin
    cur.execute("COPY table(field1, field2) FROM LOCAL STDIN DELIMITER ','", copy_stdin=sys.stdin)
    print("Rows loaded:", cur.fetchall())

    # Copy from local stdin (compound statements)
    with open('f1.csv', 'r') as fs1, open('f2.csv', 'r') as fs2:
        cur.execute("COPY tlb1(field1, field2) FROM LOCAL STDIN DELIMITER ',';"
                    "COPY tlb2(field1, field2) FROM LOCAL STDIN DELIMITER ',';",
                    copy_stdin=[fs1, fs2], buffer_size=65536)
    print("Rows loaded 1:", cur.fetchall())
    cur.nextset()
    print("Rows loaded 2:", cur.fetchall())
    
    # Copy from local stdin (StringIO)
    from io import StringIO
    data = "Anna|123-456-789\nBrown|555-444-3333\nCindy|555-867-53093453453\nDodd|123-456-789\nEd|123-456-789"
    cur.execute("COPY customers (firstNames, phoneNumbers) FROM LOCAL STDIN ENFORCELENGTH RETURNREJECTED AUTO",
                copy_stdin=StringIO(data))

```
When connection option `disable_copy_local` set to True, disables COPY LOCAL operations, including copying data from local files/stdin and using local files to store data and exceptions. You can use this property to prevent users from writing to and copying from files on a Vertica host, including an MC host. Note that this property doesn't apply to `Cursor.copy()`.

The data for copying from/writing to local files is streamed in chunks of `buffer_size` bytes, which defaults to 128 * 2 ** 10.

When executing "COPY FROM LOCAL STDIN", `copy_stdin` should be a file-like object or a list of file-like objects (specifically, any object with a `read()` method).

### Cancel the current database operation

`Connection.cancel()` interrupts the processing of the current operation. Interrupting query execution will cause the cancelled method to raise a `vertica_python.errors.QueryCanceled`. If no query is being executed, it does nothing. You can call this function from a different thread/process than the one currently executing a database operation.

```python
from multiprocessing import Process
import time
import vertica_python

def cancel_query(connection, timeout=5):
    time.sleep(timeout)
    connection.cancel()

# Example 1: Cancel the query before Cursor.execute() return.
#            The query stops executing in a shorter time after the cancel message is sent.
with vertica_python.connect(**conn_info) as conn:
    cur = conn.cursor()

    # Call cancel() from a different process
    p1 = Process(target=cancel_query, args=(conn,))
    p1.start()

    try:
        cur.execute("<Long running query>")
    except vertica_python.errors.QueryCanceled as e:
        pass
    p1.join()
    
# Example 2: Cancel the query after Cursor.execute() return.
#            Less number of rows read after the cancel message is sent.
with vertica_python.connect(**conn_info) as conn:
    cur = conn.cursor()
    cur.execute("SELECT id, time FROM large_table")
    nCount = 0
    try:
        while cur.fetchone():
            nCount += 1
            if nCount == 100:
                conn.cancel()
    except vertica_python.errors.QueryCanceled as e:
        pass
        # nCount is less than the number of rows in large_table
```

### SQL Data conversion to Python objects
When a query is executed and `Cursor.fetch*()` is called, SQL data (bytes) are deserialized as Python objects. The following table shows the default mapping from SQL data types to Python objects:

| SQL data type  | Python object type |
| -------------- | ------------------ |
| NULL           | None               |
| BOOLEAN        | bool               |
| INTEGER        | int                |
| FLOAT          | float              |
| NUMERIC        | [decimal.Decimal](https://docs.python.org/3/library/decimal.html#decimal.Decimal) |
| CHAR           | str                |
| VARCHAR        | str                |
| LONG VARCHAR   | str                |
| BINARY         | bytes              |
| VARBINARY      | bytes              |
| LONG VARBINARY | bytes              |
| UUID           | [uuid.UUID](https://docs.python.org/3/library/uuid.html#uuid.UUID) |
| DATE           | datetime.date<sup>[1]</sup> |
| TIME           | datetime.time<sup>[2]</sup> |
| TIMETZ         | datetime.time<sup>[2]</sup> |
| TIMESTAMP      | datetime.datetime<sup>[1]</sup> |
| TIMESTAMPTZ    | datetime.datetime<sup>[1]</sup> |
| INTERVAL	     | [dateutil.relativedelta.relativedelta](https://dateutil.readthedocs.io/en/stable/relativedelta.html#dateutil.relativedelta.relativedelta) |
| ARRAY          | list<sup>[3]</sup> |
| SET            | set<sup>[3]</sup>  |
| ROW            | dict<sup>[3]</sup> |
| MAP            | dict<sup>[3]</sup> |

<sup>[1]</sup>Python’s datetime.date and datetime.datetime only supports date ranges 0001-01-01 to 9999-12-31. Retrieving a value of BC date or future date (year>9999) results in an error.

<sup>[2]</sup>Python’s datetime.time only supports times until 23:59:59. Retrieving a value of 24:00:00 results in an error.

<sup>[3]</sup>If connection option 'request_complex_types' set to _False_, the server returns all complex types as VARCHAR/LONG VARCHAR Json strings, so the client will convert data to _str_ instead. Server before v12.0.2 cannot provide enough metadata for complex types, the behavior is equal to request_complex_types=False.


#### Bypass data conversion to Python objects

The `Cursor.disable_sqldata_converter` attribute can bypass the result data conversion to Python objects.

```python
with vertica_python.connect(**conn_info) as conn:
    cur = conn.cursor()
    sql = "select 'foo'::VARCHAR, 100::INT, '2001-12-01 02:50:00'::TIMESTAMP"
    
    #### Convert SQL types to Python objects ####
    print(cur.disable_sqldata_converter)   # Default is False
    # False
    cur.execute(sql)
    print(cur.fetchall())
    # [['foo', 100, datetime.datetime(2001, 12, 1, 2, 50)]]
    
    #### No Conversion: return raw bytes data ####
    cur.disable_sqldata_converter = True   # Set attribute to True
    cur.execute(sql)
    print(cur.fetchall())
    # [[b'foo', b'100', b'2001-12-01 02:50:00']]
```

As a result, this can improve query performance when you call `fetchall()` but ignore/skip result data.

#### Customize data conversion to Python objects
The `Cursor.register_sqldata_converter(oid, converter_func)` method allows to customize how SQL data values are converted to Python objects when query results are returned.

PARAMETERS:
- oid – The Vertica type OID to manage.
- converter_func – The converter function to register for oid. The function should have two arguments <`val`, `ctx`>. [Data Transfer Format](#data-transfer-format) matters for `val` (SQL data value). `ctx` is a dict managing resources that may be used by convertions. E.g. `ctx['column'].format_code` would be 0 (Text transfer format) or 1 (Binary transfer format).

The `Cursor.unregister_sqldata_converter(oid)` method allows to cancel customization and use the default converter.

Each Vertica type OID is an integer representing a SQL type, you can look up OIDs in `vertica_python.datatypes`:
```
$ python3
>>> from vertica_python.datatypes import VerticaType
>>> {k:v for k,v in dict(VerticaType.__dict__).items() if not k.startswith('_')}
{'UNKNOWN': 4, 'BOOL': 5, 'INT8': 6, 'FLOAT8': 7, 'CHAR': 8, 'VARCHAR': 9, 'DATE': 10, 'TIME': 11, 'TIMESTAMP': 12, 'TIMESTAMPTZ': 13, 'INTERVAL': 14, 'INTERVALYM': 114, 'TIMETZ': 15, 'NUMERIC': 16, 'VARBINARY': 17, 'UUID': 20, 'LONGVARCHAR': 115, 'LONGVARBINARY': 116, 'BINARY': 117, 'ROW': 300, 'ARRAY': 301, 'MAP': 302, 'ARRAY1D_BOOL': 1505, 'ARRAY1D_INT8': 1506, 'ARRAY1D_FLOAT8': 1507, 'ARRAY1D_CHAR': 1508, 'ARRAY1D_VARCHAR': 1509, 'ARRAY1D_DATE': 1510, 'ARRAY1D_TIME': 1511, 'ARRAY1D_TIMESTAMP': 1512, 'ARRAY1D_TIMESTAMPTZ': 1513, 'ARRAY1D_INTERVAL': 1514, 'ARRAY1D_INTERVALYM': 1521, 'ARRAY1D_TIMETZ': 1515, 'ARRAY1D_NUMERIC': 1516, 'ARRAY1D_VARBINARY': 1517, 'ARRAY1D_UUID': 1520, 'ARRAY1D_BINARY': 1522, 'ARRAY1D_LONGVARCHAR': 1519, 'ARRAY1D_LONGVARBINARY': 1518, 'SET_BOOL': 2705, 'SET_INT8': 2706, 'SET_FLOAT8': 2707, 'SET_CHAR': 2708, 'SET_VARCHAR': 2709, 'SET_DATE': 2710, 'SET_TIME': 2711, 'SET_TIMESTAMP': 2712, 'SET_TIMESTAMPTZ': 2713, 'SET_INTERVAL': 2714, 'SET_INTERVALYM': 2721, 'SET_TIMETZ': 2715, 'SET_NUMERIC': 2716, 'SET_VARBINARY': 2717, 'SET_UUID': 2720, 'SET_BINARY': 2722, 'SET_LONGVARCHAR': 2719, 'SET_LONGVARBINARY': 2718}
>>>
>>> # Use VerticaType.XXXX to refer to an OID
>>> VerticaType.VARCHAR
9
>>>
>>> from vertica_python.datatypes import TYPENAME
>>> TYPENAME   # mapping from OIDs to readable names
{4: 'Unknown', 5: 'Boolean', 6: 'Integer', 7: 'Float', 8: 'Char', 9: 'Varchar', 115: 'Long Varchar', 10: 'Date', 11: 'Time', 15: 'TimeTz', 12: 'Timestamp', 13: 'TimestampTz', 117: 'Binary', 17: 'Varbinary', 116: 'Long Varbinary', 16: 'Numeric', 20: 'Uuid', 300: 'Row', 301: 'Array', 302: 'Map', 1505: 'Array[Boolean]', 1506: 'Array[Int8]', 1507: 'Array[Float8]', 1508: 'Array[Char]', 1509: 'Array[Varchar]', 1510: 'Array[Date]', 1511: 'Array[Time]', 1512: 'Array[Timestamp]', 1513: 'Array[TimestampTz]', 1515: 'Array[TimeTz]', 1516: 'Array[Numeric]', 1517: 'Array[Varbinary]', 1520: 'Array[Uuid]', 1522: 'Array[Binary]', 1519: 'Array[Long Varchar]', 1518: 'Array[Long Varbinary]', 2705: 'Set[Boolean]', 2706: 'Set[Int8]', 2707: 'Set[Float8]', 2708: 'Set[Char]', 2709: 'Set[Varchar]', 2710: 'Set[Date]', 2711: 'Set[Time]', 2712: 'Set[Timestamp]', 2713: 'Set[TimestampTz]', 2715: 'Set[TimeTz]', 2716: 'Set[Numeric]', 2717: 'Set[Varbinary]', 2720: 'Set[Uuid]', 2722: 'Set[Binary]', 2719: 'Set[Long Varchar]', 2718: 'Set[Long Varbinary]'}
```

**Example: Vertica numeric to Python float**

Normally Vertica NUMERIC values are converted to Python decimal.Decimal instances, because both the types allow fixed-precision arithmetic and are not subject to rounding. Sometimes, however, you may want to perform floating-point math on NUMERIC values, and decimal.Decimal may get in the way. If you are fine with the potential loss of precision and you simply want to receive NUMERIC values as Python float, you can register a converter on NUMERIC.

```python
import vertica_python
from vertica_python.datatypes import VerticaType

conn_info = {'host': '127.0.0.1',
             'port': 5433,
             ...
            }

conn = vertica_python.connect(**conn_info)
cur = conn.cursor()
cur.execute("SELECT 123.45::NUMERIC")
print(cur.fetchone()[0])
# Decimal('123.45')

def convert_numeric(val, ctx):
    # val: bytes - this is a text representation of NUMERIC value
    # ctx: dict - some info that may be useful to the converter
    return float(val)

cur.register_sqldata_converter(VerticaType.NUMERIC, convert_numeric)
cur.execute("SELECT 123.45::NUMERIC")
print(cur.fetchone()[0])
# 123.45

cur.unregister_sqldata_converter(VerticaType.NUMERIC)  # reset
cur.execute("SELECT 123.45::NUMERIC")
print(cur.fetchone()[0])
# Decimal('123.45')
```

**Example: Vertica complex types**

The raw bytes data of complex types (ARRAY, MAP, ROW, SET) are in JSON format for both Text & Binary transfer format.

```python
import json
import numpy as np
import vertica_python
from vertica_python.datatypes import VerticaType

conn_info = {'host': '127.0.0.1',
             'port': 5433,
             'binary_transfer': False/True,
             ...
            }

conn = vertica_python.connect(**conn_info)
cur = conn.cursor()

#==================================================
cur.execute("SELECT ARRAY[-1.234, 0, 1.66, null, 50]::ARRAY[FLOAT]")
data = cur.fetchone()[0]
print(type(data))  # <class 'list'>
print(data)  # [-1.234, 0.0, 1.66, None, 50.0]
numpy_data = np.array(data) # This is equal to the query value below
#==================================================
def convert_array(val, ctx):
    # val: b'[-1.234,0.0,1.66,null,50.0]'
    json_data = json.loads(val)
    return np.array(json_data)
# VerticaType.ARRAY1D_FLOAT8 represents one-dimensional array of FLOAT type
cur.register_sqldata_converter(VerticaType.ARRAY1D_FLOAT8, convert_array)
cur.execute("SELECT ARRAY[-1.234, 0, 1.66, null, 50]::ARRAY[FLOAT]")
data = cur.fetchone()[0]
print(type(data))  # <class 'numpy.ndarray'>
print(data)  # [-1.234 0.0 1.66 None 50.0]

#==================================================
# VerticaType.ARRAY represents multidimensional array or contain ROWs
cur.register_sqldata_converter(VerticaType.ARRAY, convert_array)
cur.execute("SELECT ARRAY[ARRAY[-1, 234, 5],ARRAY[88, 0, 19]]::ARRAY[ARRAY[INT]]")
data = cur.fetchone()[0]
print(type(data))  # <class 'numpy.ndarray'>
print(data)
#[[ -1 234   5]
# [ 88   0  19]]
```
```python
import pandas
import vertica_python
from io import BytesIO
from vertica_python.datatypes import VerticaType

conn_info = {'host': '127.0.0.1',
             'port': 5433,
             'binary_transfer': False/True,
             ...
            }

conn = vertica_python.connect(**conn_info)
cur = conn.cursor()

cur.execute("SELECT ROW(ROW('a','b') as row1, ROW('c','d') as row2)")
data = cur.fetchone()[0]
print(type(data)) # <class 'dict'>
print(data)  # {'row1': {'f0': 'a', 'f1': 'b'}, 'row2': {'f0': 'c', 'f1': 'd'}}

def convert_row(val, ctx):
    # val: b'{"row1":{"f0":"a","f1":"b"},"row2":{"f0":"c","f1":"d"}}'
    return pandas.read_json(BytesIO(val), orient='index')

cur.register_sqldata_converter(VerticaType.ROW, convert_row)
cur.execute("SELECT ROW(ROW('a','b') as row1, ROW('c','d') as row2)")
data = cur.fetchone()[0]
print(type(data)) # <class 'pandas.core.frame.DataFrame'>
print(data)
#      f0 f1
# row1  a  b
# row2  c  d
```


If you want to learn how default converters for each transfer format and oid works, look at the source code at `vertica_python/vertica/deserializer.py`

### Shortcuts
The `Cursor.execute()` method returns `self`. This means that you can chain a fetch operation, such as `fetchone()`, to the `execute()` call:
```python
row = cursor.execute(...).fetchone()

for row in cur.execute(...).fetchall():
    ...
```

## Rowcount oddities

vertica_python behaves a bit differently than dbapi when returning rowcounts.

After a select execution, the rowcount will be -1, indicating that the row count is unknown. The rowcount value will be updated as data is streamed.

```python
cur.execute('SELECT 10 things')

cur.rowcount == -1  # indicates unknown rowcount

cur.fetchone()
cur.rowcount == 1
cur.fetchone()
cur.rowcount == 2
cur.fetchall()
cur.rowcount == 10
```

After an insert/update/delete, the rowcount will be returned as a single element row:

```python
cur.execute("DELETE 3 things")

cur.rowcount == -1  # indicates unknown rowcount
cur.fetchone()[0] == 3
```


## UTF-8 encoding issues

While Vertica expects varchars stored to be UTF-8 encoded, sometimes invalid strings get into the database. You can specify how to handle reading these characters using the unicode_error connection option. This uses the same values as the unicode type (https://docs.python.org/3/library/codecs.html#error-handlers)

```python
cur = vertica_python.Connection({..., 'unicode_error': 'strict'}).cursor()
cur.execute(r"SELECT E'\xC2'")
cur.fetchone()
# caught 'utf8' codec can't decode byte 0xc2 in position 0: unexpected end of data

cur = vertica_python.Connection({..., 'unicode_error': 'replace'}).cursor()
cur.execute(r"SELECT E'\xC2'")
cur.fetchone()
# �

cur = vertica_python.Connection({..., 'unicode_error': 'ignore'}).cursor()
cur.execute(r"SELECT E'\xC2'")
cur.fetchone()
# 
```

## License

Apache 2.0 License, please see `LICENSE` for details.

## Contributing guidelines

Have a bug or an idea? Please see `CONTRIBUTING.md` for details.

## Acknowledgements

We would like to thank the contributors to the Ruby Vertica gem (https://github.com/sprsquish/vertica), as this project gave us inspiration and help in understanding Vertica's wire protocol. These contributors are:

 * [Matt Bauer](http://github.com/mattbauer)
 * [Jeff Smick](http://github.com/sprsquish)
 * [Willem van Bergen](http://github.com/wvanbergen)
 * [Camilo Lopez](http://github.com/camilo)