1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104
|
Question to ask user & general support
=======================================
o Platform and compiler and boost version ?
It may be compiler/boost version that we have not tested ?
o Version
o Are they running on a virtual machine ?
- Check size of VM for ecflow_server, see below about log file ECFLOW-174
o Need the definition file.
- Gives a que to the size of data transmitted for client/server communication.
- We can use Pyext/test/TestBench.py to load to an existing/new server
to test the customer defs.
o Log file
- Will verify the server version
- show server load, kind of requests, performance issues i.e news/sync
- ** Look out for --log=get in ecflow version < 4.0.6 **and** where the log file is huge.
** This is because the server read in the whole of the log file, and therefore consumed to much memory
** On virtual machines this may cause server to hang.
** Ask user the VM taken up by the server
** ECFLOW-174
- Watch out for --log=get <large number> this will have a same effect if <large number> > 1000000
o Ask user to supply info returned from:
ecflow_client --port< > --host<> --stats
Will verify server version and the total number of requests handled by the server
Will also record the request handled in the last minute.
ecflow_client --port< > --host<> --ping
From the host/machine where they are running ecflowview
See if there is any networking issue.
Should vary between 1-65 milli second response for decent network response
itanium ->linix(11.3) ~ 12-65 ms
ecgate ->linix(11.3) ~ 6-27 ms
c1a ->linux(11.3) ~ 18-50 ms
c2a ->linux(11.3) ~ 13-26 ms
linux(10.3)->linux(11.3) ~ 1-7 ms
linux(11.3)->linux(11.3) ~ 1-7 ms
If it takes long, ask them to do:
ping -R <host>
- Look for variation in ping times, along a given network path.
- Look for number of pings on a network path getting no response
This records the packet routing(network path) information,
and might indicate why its slow.
o If there is a performance problem::
* First establish if its *consistent* and reproducible*
- Establish if they are using the right version of ecflowview.
Post 3.1.0, we now put version in the ecflowview window title.
- Check ecflowview Poll rate, if this is excessive it can interfere with interactive use.
i.e Edit->Preference, select "Server Options" tab
Check "Call server every" <> seconds. If this is <15 seconds, it could explain
slow interactive use.
The following should be done *before* trying to reproduce the problem
* If the using version < 4.0.6 and have large log files, check for '--log=get' in log file.
This would have had an effect on the server, especially if running on VM.
* In ecflowView, select the server, RMB->Extra->Client Logging on
This will produce a file called:
>> ecflow_client.log
This will log the round trip time from the client ->server ->client
for all user commands make by the GUI.
Please send the file back.
make sure they disable it afterwards, ie RMB->Extra->Client Logging off
* Run strace;
- ps -ef | grep ecflow_server # not the pid
- strace -p <pid> strace.txt 2>& 1
*OR* strace -p <pid> -o tmp.txt
This will record all the system level commands, there arguments and return values,
additionally we can see the *size* in bytes of the client/server communication.
watch out for sendmsg/recvmsg.
Type CTRL Q, to stop recording, and send the file.
* If server is stuck.
find process id of server:
>> fstack <process-id> # use fstack to dump the back trace of where it is stuck
* cannot connect, or server is dropping connection: ('Server::handle_read: End of file')
Use netcat: on some systems may it is called nc:
server side:
> netcat -p <port> -l -v # -l list for inbound connection, -v verbose mode
> netcat -p 3141 -l -v # use netcat to listen on port 3141, in verbose mode
client side:
> netcat <host> <port>
> netcat ecflow-metab 3141
> type a message this should be seen on the server side
|