File: nes_release_notes.txt

            Open Fabrics Enterprise Distribution (OFED)
      NetEffect Ethernet Cluster Server Adapter Release Notes
                           May 2009



The iw_nes module and libnes user library provide RDMA and L2IF
support for the NetEffect Ethernet Cluster Server Adapters.


============================================
Required Setting - RDMA Unify TCP port space
============================================
RDMA connections use the same TCP port space as the host stack.  To avoid
conflicts, set the rdma_cm module option unify_tcp_port_space to 1 by adding
the following to /etc/modprobe.conf:

    options rdma_cm unify_tcp_port_space=1
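
One way to add this line without duplicating it on repeated runs is sketched
below.  The helper name add_modprobe_option is hypothetical and not part of
OFED, and writing to /etc/modprobe.conf is assumed to require root privileges.
The option takes effect the next time the rdma_cm module is loaded.

```shell
# Hypothetical helper (not part of OFED): append an "options" line to a
# modprobe configuration file only if that exact line is not already present.
# Usage: add_modprobe_option FILE MODULE NAME=VALUE...
add_modprobe_option() {
    conf_file=$1
    shift
    line="options $*"
    # -x matches the whole line, -F treats the pattern as a fixed string.
    if ! grep -qxF -- "$line" "$conf_file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$conf_file"
    fi
}

# Example (run as root):
#   add_modprobe_option /etc/modprobe.conf rdma_cm unify_tcp_port_space=1
```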


=======================
Loadable Module Options
=======================
The following options can be set when loading the iw_nes module by modifying
the modprobe.conf file:

wide_ppm_offset = 0
    Setting this to 1 increases the CX4 interface clock ppm offset to 300ppm.
    The default setting of 0 gives 100ppm.

mpa_version = 1
    MPA version to be used in MPA Req/Resp (0 or 1).

disable_mpa_crc = 0
    Disable checking of MPA CRC.

send_first = 0
    Send RDMA Message First on Active Connection.

nes_drv_opt = 0x00000100
    The following options are supported:

    Enable MSI - 0x00000010
    No Inline Data - 0x00000080
    Disable Interrupt Moderation - 0x00000100
    Disable Virtual Work Queue - 0x00000200

nes_debug_level = 0
    Enable debug output level.

wqm_quanta = 65536
    Set size of data to be transmitted at a time.

limit_maxrdreqsz = 0
    Limit PCI read request size to 256 bytes.
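
The nes_drv_opt values listed above are single-bit flags, so it is assumed
here that several of them can be combined by OR-ing the values together.  The
sketch below computes such a mask.  The function and variable names are
illustrative only and do not come from the driver.

```shell
# Flag values copied from the nes_drv_opt list above.
NES_ENABLE_MSI=0x00000010
NES_NO_INLINE_DATA=0x00000080
NES_DISABLE_INT_MOD=0x00000100
NES_DISABLE_VIRT_WQ=0x00000200

# Hypothetical helper: OR the given flag values together and print the
# result as an 8-digit hexadecimal number.
nes_drv_opt_mask() {
    mask=0
    for flag in "$@"; do
        mask=$(( mask | $flag ))
    done
    printf '0x%08x\n' "$mask"
}

# Enable MSI and disable interrupt moderation:
nes_drv_opt_mask "$NES_ENABLE_MSI" "$NES_DISABLE_INT_MOD"    # prints 0x00000110
```

Under that assumption, the resulting value would be placed in modprobe.conf
as "options iw_nes nes_drv_opt=0x00000110".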


===============
Runtime Options
===============
The following options can be used to alter the behavior of the iw_nes module:
NOTE: These examples assume the NetEffect Ethernet Cluster Server Adapter is
assigned eth2.

    ifconfig eth2 mtu 9000  - largest mtu supported

    ethtool -K eth2 tso on  - enables TSO
    ethtool -K eth2 tso off - disables TSO

    ethtool -C eth2 rx-usecs-irq 128 - set static interrupt moderation

    ethtool -C eth2 adaptive-rx on  - enable dynamic interrupt moderation
    ethtool -C eth2 adaptive-rx off - disable dynamic interrupt moderation 
    ethtool -C eth2 rx-frames-low 16 - low watermark of rx queue for dynamic
                                       interrupt moderation
    ethtool -C eth2 rx-frames-high 256 - high watermark of rx queue for
                                         dynamic interrupt moderation
    ethtool -C eth2 rx-usecs-low 40 - smallest interrupt moderation timer
                                      for dynamic interrupt moderation
    ethtool -C eth2 rx-usecs-high 1000 - largest interrupt moderation timer
                                         for dynamic interrupt moderation
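
The dynamic interrupt moderation settings above can also be passed to a
single "ethtool -C" invocation.  The sketch below builds that command for a
given interface.  The function name and the NES_DRY_RUN variable are
hypothetical, and the values are the example values from the list above, not
tuned recommendations.

```shell
# Hypothetical helper: enable dynamic interrupt moderation on one interface
# using the example values listed above.  With NES_DRY_RUN=1 the command is
# printed instead of executed, so it can be reviewed before it is applied.
nes_set_dynamic_moderation() {
    ifname=${1:-eth2}
    set -- ethtool -C "$ifname" adaptive-rx on \
        rx-frames-low 16 rx-frames-high 256 \
        rx-usecs-low 40 rx-usecs-high 1000
    if [ "${NES_DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Print the command that would be run for eth2:
NES_DRY_RUN=1 nes_set_dynamic_moderation eth2
```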


===================
uDAPL Configuration
===================
The rest of this document assumes the following uDAPL settings in dat.conf:

    OpenIB-cma-nes u1.2 nonthreadsafe default libdaplcma.so.1 dapl.1.2 "eth2 0" ""
    ofa-v2-nes u2.0 nonthreadsafe default libdaplofa.so.2 dapl.2.0 "eth2 0" ""
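
Because the MPI examples that follow refer to these provider names, it can be
useful to confirm that both entries are present before launching a job.  A
minimal sketch follows; the helper name is hypothetical, and the /etc/dat.conf
path in the usage comment is an assumption about where the file is installed.

```shell
# Hypothetical helper: succeed if FILE contains an entry whose first field
# is exactly the given provider name.  Commented-out entries do not match
# because their first field starts with "#".
# Usage: dat_has_provider FILE PROVIDER
dat_has_provider() {
    awk -v name="$2" '$1 == name { found = 1 } END { exit !found }' "$1"
}

# Example (the path is an assumption):
#   for p in OpenIB-cma-nes ofa-v2-nes; do
#       dat_has_provider /etc/dat.conf "$p" || echo "missing entry: $p" >&2
#   done
```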


=====================================
Recommended Settings for HP MPI 2.2.7
=====================================
Add the following to the mpirun command:

    -1sided

Example mpirun command with uDAPL-2.0:

    mpirun -UDAPL -prot -intra=shm \
           -e MPI_ICLIB_UDAPL=libdaplofa.so.2 \
           -e MPI_HASIC_UDAPL=ofa-v2-nes \
           -1sided \
           -f /opt/hpmpi/appfile

Example mpirun command with uDAPL-1.2:

    mpirun -UDAPL -prot -intra=shm \
           -e MPI_ICLIB_UDAPL=libdaplcma.so.1 \
           -e MPI_HASIC_UDAPL=OpenIB-cma-nes \
           -1sided \
           -f /opt/hpmpi/appfile


======================================
Recommended Settings for Intel MPI 3.2
======================================
Add the following to the mpiexec command:

    -genv I_MPI_FALLBACK_DEVICE 0
    -genv I_MPI_DEVICE rdma:OpenIB-cma-nes
    -genv I_MPI_RENDEZVOUS_RDMA_WRITE 1

Example mpiexec command line for uDAPL-2.0:

    mpiexec -genv I_MPI_FALLBACK_DEVICE 0 \
            -genv I_MPI_DEVICE rdma:ofa-v2-nes \
            -genv I_MPI_RENDEZVOUS_RDMA_WRITE 1 \
            -ppn 1 -n 2 \
            /opt/intel/impi/3.2.0.011/bin64/IMB-MPI1

Example mpiexec command line for uDAPL-1.2:

    mpiexec -genv I_MPI_FALLBACK_DEVICE 0 \
            -genv I_MPI_DEVICE rdma:OpenIB-cma-nes \
            -genv I_MPI_RENDEZVOUS_RDMA_WRITE 1 \
            -ppn 1 -n 2 \
            /opt/intel/impi/3.2.0.011/bin64/IMB-MPI1


========================================
Recommended Setting for MVAPICH2 and OFA
========================================
Add the following to the mpirun command:

    -env MV2_USE_RDMA_CM 1
    -env MV2_USE_IWARP_MODE 1

For a larger number of processes, it is also recommended to set the following:

    -env MV2_MAX_INLINE_SIZE 64
    -env MV2_USE_SRQ 0

Example mpiexec command line:

    mpiexec -l -n 2 \
            -env MV2_USE_RDMA_CM 1 \
            -env MV2_USE_IWARP_MODE 1 \
            /usr/mpi/gcc/mvapich2-1.2p1/tests/osu_benchmarks-3.0/osu_latency


==========================================
Recommended Setting for MVAPICH2 and uDAPL
==========================================
Add the following to the mpirun command:

    -env MV2_PREPOST_DEPTH 59

Example mpiexec command line:

    mpiexec -l -n 2 \
            -env MV2_DAPL_PROVIDER ofa-v2-nes \
            -env MV2_PREPOST_DEPTH 59 \
            /usr/mpi/gcc/mvapich2-1.2p1/tests/osu_benchmarks-3.0/osu_latency

    mpiexec -l -n 2 \
            -env MV2_DAPL_PROVIDER OpenIB-cma-nes \
            -env MV2_PREPOST_DEPTH 59 \
            /usr/mpi/gcc/mvapich2-1.2p1/tests/osu_benchmarks-3.0/osu_latency


===========================
Modify Settings in Open MPI
===========================
There is more than one way to specify MCA parameters in
Open MPI.  Please visit this link and use the best method
for your environment:

http://www.open-mpi.org/faq/?category=tuning#setting-mca-params
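
As a quick orientation, three commonly used forms are the mpirun command
line, an OMPI_MCA_<name> environment variable, and a "name = value" line in
$HOME/.openmpi/mca-params.conf.  The sketch below prints all three for one
parameter.  The helper name is hypothetical; the linked FAQ remains the
authoritative description.

```shell
# Hypothetical helper: print three equivalent ways of setting one MCA
# parameter (command line, environment variable, parameter file).
# Usage: show_mca_forms NAME VALUE
show_mca_forms() {
    name=$1
    value=$2
    echo "mpirun -mca $name $value ..."
    echo "export OMPI_MCA_$name=$value"
    echo "$name = $value    # in \$HOME/.openmpi/mca-params.conf"
}

show_mca_forms btl_openib_max_inline_data 64
```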


=======================================
Recommended Settings for Open MPI 1.3.2
=======================================
Caching of pinned memory is enabled by default.  To prevent running out of
memory, it may be necessary to limit the size of the cache by adding the
following parameter:

    mpool_rdma_rcache_size_limit = <cache size>

The cache size depends on the number of processes and nodes.  For example,
for 64 processes on 8 nodes, limit the pinned cache size to
104857600 (100 MBytes).

Example mpirun command line:

    mpirun -np 2 -hostfile /opt/mpd.hosts \
           -mca btl openib,self,sm \
           -mca mpool_rdma_rcache_size_limit 104857600 \
           /usr/mpi/gcc/openmpi-1.3.2/tests/IMB-3.1/IMB-MPI1
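
The cache size limit is given in bytes.  The small sketch below shows how the
104857600 value used above follows from 100 MBytes, taking 1 MByte as
1024 * 1024 bytes; the helper name is hypothetical.

```shell
# Hypothetical helper: convert a size in MBytes to bytes
# (1 MByte = 1024 * 1024 bytes).
mb_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

mb_to_bytes 100    # prints 104857600
```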


=======================================
Recommended Settings for Open MPI 1.3.1
=======================================
There is a known problem with cached pinned memory.  It is recommended
that pinned memory caching be disabled.  For more information, see
https://svn.open-mpi.org/trac/ompi/ticket/1853

To disable pinned memory caching, add the following parameter:

    mpi_leave_pinned = 0

Example mpirun command line:

    mpirun -np 2 -hostfile /opt/mpd.hosts \
           -mca btl openib,self,sm \
           -mca mpi_leave_pinned 0 \
           /usr/mpi/gcc/openmpi-1.3.1/tests/IMB-3.1/IMB-MPI1


=====================================
Recommended Settings for Open MPI 1.3
=====================================
There is a known problem with cached pinned memory.  It is recommended
that pinned memory caching be disabled.  For more information, see
https://svn.open-mpi.org/trac/ompi/ticket/1853

To disable pinned memory caching, add the following parameter:

    mpi_leave_pinned = 0

Receive Queue setting:

    btl_openib_receive_queues = P,65536,256,192,128

Set maximum size of inline data segment to 64:

    btl_openib_max_inline_data = 64

Example mpirun command:

    mpirun -np 2 -hostfile /root/mpd.hosts \
           -mca btl openib,self,sm \
           -mca mpi_leave_pinned 0 \
           -mca btl_openib_receive_queues P,65536,256,192,128 \
           -mca btl_openib_max_inline_data 64 \
           /usr/mpi/gcc/openmpi-1.3/tests/IMB-3.1/IMB-MPI1


============
Known Issues
============
The following is a list of known issues with the Linux kernel and
the OFED 1.4.1 release.

1. A "__qdisc_run" soft lockup crash has been observed when running UDP
   traffic on RHEL5.1 systems with more than 8 cores.  The issue
   is in the Linux network stack.  The fix is available at
   the following link:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=2ba2506ca7ca62c56edaa334b0fe61eb5eab6ab0;hp=32aced7509cb20ef3ec67c9b56f5b55c41dd4f8d


2. Running the Pallas test suite with MVAPICH2 (OFA/uDAPL) on more
   than 64 processes terminates abnormally.  The workaround is
   to add the following to the mpirun command:

   -env MV2_ON_DEMAND_THRESHOLD <total processes>

   e.g. For 72 total processes, -env MV2_ON_DEMAND_THRESHOLD 72


3. For MVAPICH2 (OFA/uDAPL), the IMB-EXT (part of the Pallas suite) "Window"
   test may show high latency numbers.  It is recommended to turn off
   one-sided communication by adding the following to the mpirun command:

   -env MV2_USE_RDMA_ONE_SIDED 0


4. IMB-EXT does not run with Open MPI 1.3.1 or 1.3.  The workaround is
   to turn off message coalescing by adding the following to the mpirun
   command:

    -mca btl_openib_use_message_coalescing 0


NetEffect is a trademark of Intel Corporation in the U.S. and other countries.