File: USECASE.txt

SERVICE/WORKER ARCHITECTURE
* pause/resume job
* cancel job
* cancel map
* slow node
* one data point more expensive than the others
* infinite loop for some values
* exceptions for some values
* worker not stateless: one calculation interferes with the next
* worker memory leak
* worker requires initial state
* worker state changes between map calls
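
One way to handle the "exceptions for some values" case is to capture failures
per value instead of aborting the whole map. The following is a minimal sketch
only, not the service's actual interface: safe_map and reciprocal are
hypothetical names, and a local thread pool stands in for the remote workers.

```python
from concurrent.futures import ThreadPoolExecutor


def safe_map(fn, values, max_workers=4):
    """Apply fn to each value; return (results, errors).

    results[i] is fn(values[i]), or None when that call raised.
    errors maps the index of each failed value to its exception.
    """
    values = list(values)
    results = [None] * len(values)
    errors = {}
    # Assumption: a thread pool stands in for the real worker pool.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(fn, v) for v in values]
        for i, future in enumerate(futures):
            try:
                results[i] = future.result()
            except Exception as exc:
                # Record the failure and keep collecting the other results.
                errors[i] = exc
    return results, errors


def reciprocal(x):
    # Hypothetical worker function: raises ZeroDivisionError for x == 0.
    return 1.0 / x
```

Calling safe_map(reciprocal, [1, 0, 4]) would leave the failed entry as None
and report its exception under index 1, so the client can decide whether to
retry, skip, or cancel.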

* client requests logging on service, any worker, or all workers
* logging based on condition
* rerun value with logging when worker fails
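
The "rerun value with logging when worker fails" case might look like the
sketch below. It assumes the worker reports through the standard logging
module; the logger name "worker" and the helper run_with_rerun are
hypothetical, and a real service would also need to route the log records
back to the client.

```python
import logging

log = logging.getLogger("worker")  # hypothetical logger name


def run_with_rerun(fn, value):
    """Run fn(value); if it raises, rerun once with DEBUG logging enabled.

    If the rerun also raises, the exception propagates to the caller.
    """
    try:
        return fn(value)
    except Exception:
        previous = log.level
        log.setLevel(logging.DEBUG)
        try:
            log.debug("rerunning value %r with logging enabled", value)
            return fn(value)
        finally:
            # Restore the original level whether or not the rerun succeeded.
            log.setLevel(previous)
```

Logging stays quiet on the normal path, so the cost of detailed output is
paid only for the values that actually fail.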

* parallel random number generator, with seed control
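
As an illustration of seed control, each worker can derive its own generator
from a master seed and its worker id, so a run is reproducible regardless of
which node picks up which task. This is a sketch under stated assumptions:
worker_rng is a hypothetical helper, and hashing the seed gives distinct,
repeatable streams but not the statistical guarantees of a purpose-built
parallel generator.

```python
import hashlib
import random


def worker_rng(master_seed, worker_id):
    """Return a random.Random seeded from both master_seed and worker_id."""
    # Hash the pair so that nearby worker ids get unrelated seeds.
    digest = hashlib.sha256(f"{master_seed}:{worker_id}".encode()).digest()
    return random.Random(int.from_bytes(digest[:8], "big"))
```

Two calls with the same (master_seed, worker_id) pair produce the same
sequence, while changing either argument produces a different one.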

* worker which needs workers

CLUSTER MANAGEMENT
* priority jobs
* start/stop workers when job starts
* add client to processing pool, with preference for client jobs
* add new node
* machine reboots: exchange, service, worker, monitor
* unreliable processor/memory on some nodes
* workers which use local disk but don't clean up after themselves
* service monitoring
* identify all processes for a given user
* identify idle machines

JOB MANAGER
* job manager machine reboots
* job manager upgrades
* progress thumbnail
* client detach/attach
* retrieve results
* notify user when job starts (message queue or email)

DEPLOYMENT
* egg basket and trusted users
* different users having conflicting dependencies
* worker implemented in Java, IDL, Matlab, R, ...
* client implemented in Java, IDL, Matlab, R, ...
* code movement while developing worker

DATA MANAGEMENT
* big file, independent disks
* storing run results

SCALABILITY
* workstation, dedicated cluster, batch queue, network of workstations (NOW), EC2, TeraGrid

SECURITY (SNS, TeraGrid, shared clusters)
* service/workers run as user
* exchange runs as user
* job manager behind firewall
* cluster behind firewall