File: FUTURE.txt

FUTURE plans for GraphBLAS:

    JIT package: don't check only the first line of GraphBLAS.h when deciding
        whether to unpack the source into the user cache folder.  Use a CRC
        test instead.
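
        A minimal sketch of what such a CRC test could look like.  The names
        sketch_crc32 and sketch_cache_is_stale are hypothetical and not part
        of GraphBLAS; the sketch assumes the expected CRC of the packaged
        source is known at build time and that the cached file has already
        been read into memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Bitwise CRC-32 with the reflected polynomial 0xEDB88320 (the same CRC
// used by zlib and PNG).  Slow but tiny; a table-driven variant is faster.
uint32_t sketch_crc32 (const unsigned char *buf, size_t len)
{
    assert (buf != NULL || len == 0) ;
    uint32_t crc = 0xFFFFFFFFu ;
    for (size_t k = 0 ; k < len ; k++)
    {
        crc ^= buf [k] ;
        for (int bit = 0 ; bit < 8 ; bit++)
        {
            if (crc & 1u)
            {
                crc = (crc >> 1) ^ 0xEDB88320u ;
            }
            else
            {
                crc = (crc >> 1) ;
            }
        }
    }
    return (~crc) ;
}

// Hypothetical helper: returns 1 if the cached copy does not match the
// CRC of the packaged source and so should be unpacked again, 0 otherwise.
int sketch_cache_is_stale (const unsigned char *cached, size_t cached_len,
    uint32_t expected_crc)
{
    return (sketch_crc32 (cached, cached_len) != expected_crc) ;
}
```

        Comparing the CRC of the whole file catches a changed cache even when
        the first line is unchanged.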

    cumulative sum (or other monoid)
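
        For illustration only, an inclusive cumulative "sum" over an
        arbitrary monoid on a dense array of doubles.  All names here
        (scan_binop, scan_plus, scan_max, sketch_cumulative) are hypothetical;
        this is a sequential sketch, not a proposed GraphBLAS interface.

```c
#include <assert.h>
#include <stddef.h>

// a monoid is given here by its binary operator and its identity value
typedef double (*scan_binop) (double x, double y) ;

double scan_plus (double x, double y) { return (x + y) ; }
double scan_max  (double x, double y) { return ((x > y) ? x : y) ; }

// inclusive scan: out [k] = in [0] (op) in [1] (op) ... (op) in [k]
void sketch_cumulative (double *out, const double *in, size_t n,
    scan_binop op, double identity)
{
    assert (op != NULL) ;
    double running = identity ;
    for (size_t k = 0 ; k < n ; k++)
    {
        running = op (running, in [k]) ;
        out [k] = running ;
    }
}
```

        With scan_plus and identity 0 this is the usual cumulative sum; with
        scan_max and a very negative identity it gives a running maximum.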

    Raye: link-time optimization with binary for operators, for Julia

    pack/unpack COO

    kernel fusion

    CUDA kernels
        CUDA: finding src
        CUDA: kernel source location, and name

    distributed framework

    fine-grain parallelism for dot-product based mxm, mxv, vxm,
        then add GxB_vxvt (outer product) and GxB_vtxv (inner product)
        (or call them GxB_outerProduct and GxB_innerProduct?)

    aggregators

    GrB_extract with GrB_Vectors instead of (GrB_Index *) arrays for I and J

    iso: set a flag with GrB_get/set to disable iso.  Useful if the matrix is
        about to become non-iso anyway.  PageRank does:

        r = teleport (becomes iso)
        r += A*x     (becomes non-iso)

    apply: C = f(A), A dense, no mask or accum, C already dense: do in place

    JIT: allow a flag to be set in a type or operator to selectively control
        the JIT

    JIT: currently requires GxB_BinaryOp_new to give the string that defines
        the op.  Allow the use of
        GrB_BinaryOp_new (...)
        GrB_set (op, GxB_DEFN, "string")
        instead, and likewise for all other kinds of ops.

    monoids with terminal conditions (as a function), not just a terminal
        value
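
        One case where a single terminal value is not enough, sketched under
        assumptions: an argmin monoid on (value, index) pairs where all
        values are known to be >= 0.  Any pair with value 0 is terminal,
        whatever its index, so the condition is naturally a function.  The
        names below are hypothetical and not part of GraphBLAS.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { double val ; size_t idx ; } sketch_entry ;
typedef int (*sketch_terminal_fn) (sketch_entry partial) ;

// argmin monoid: keep the smaller value; ties go to the smaller index
sketch_entry sketch_argmin (sketch_entry a, sketch_entry b)
{
    if (a.val < b.val) return (a) ;
    if (b.val < a.val) return (b) ;
    return ((a.idx <= b.idx) ? a : b) ;
}

// terminal condition: assumes all values are >= 0, so 0 cannot be beaten
int sketch_is_zero (sketch_entry partial)
{
    return (partial.val <= 0.0) ;
}

// reduce x [0..n-1] in index order, stopping as soon as is_terminal holds;
// *nvisited (if not NULL) returns how many entries were examined
sketch_entry sketch_reduce_argmin (const double *x, size_t n,
    sketch_terminal_fn is_terminal, size_t *nvisited)
{
    assert (x != NULL || n == 0) ;
    sketch_entry partial = { INFINITY, SIZE_MAX } ;     // monoid identity
    size_t k = 0 ;
    while (k < n)
    {
        sketch_entry e = { x [k], k } ;
        partial = sketch_argmin (partial, e) ;
        k++ ;
        if (is_terminal != NULL && is_terminal (partial)) break ;
    }
    if (nvisited != NULL) (*nvisited) = k ;
    return (partial) ;
}
```

        Passing NULL for is_terminal gives the ordinary reduction with no
        early exit.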

    candidates for kernel fusion:
        * triangle counting: mxm then reduce to scalar
        * lcc: mxm then reduce to vector
        * FusedMM: see https://arxiv.org/pdf/2011.06391.pdf

    more:
        * consider algorithms where fusion can occur
        * performance monitor, or revised burble, to detect generic cases
        * check if vectorization of GrB_mxm is effective when using clang
        * see how HNSW vector search could be implemented in GraphBLAS

    printing user-defined types

    Support for COO / Tuple formats:

        The Container method could be extended to the COO / Tuples format.  It
        would be like GrB_Matrix_build when moving the tuples to a matrix, but
        the method would be faster than GrB_Matrix_build /
        GrB_Matrix_extractTuples.  The row indices, column indices, and values
        in the Container could be moved into the matrix, saving time and space.
        I already have this capability in my internal methods but there is no
        user interface for it.

        The Container could include a binary operator, which would be used to
        combine duplicate entries.

        The COO -> Container -> GrB_Matrix construction would not take O(1)
        time and space, but it would be faster and take less memory than
        the existing GrB_Matrix_build.
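
        To make the role of the binary operator concrete, a small sketch of
        turning COO tuples into CSR form while combining duplicates with a
        user-supplied operator.  This is not the internal GraphBLAS method
        and not a proposed interface; the names are hypothetical, values are
        assumed to be double, and the per-row insertion sort is chosen for
        brevity, not speed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef double (*sketch_dup_op) (double x, double y) ;

double sketch_dup_plus (double x, double y) { return (x + y) ; }

// Build CSR (Ap, Aj, Ax) from nvals tuples (I, J, X).  Ap has size nrows+1;
// Aj and Ax have size nvals (allocated by the caller).  Duplicates are
// combined with dup, in tuple order.  Returns the number of entries in the
// result, or -1 if out of memory.
int64_t sketch_coo_to_csr (int64_t *Ap, int64_t *Aj, double *Ax,
    const int64_t *I, const int64_t *J, const double *X,
    int64_t nrows, int64_t nvals, sketch_dup_op dup)
{
    assert (dup != NULL && nrows >= 0 && nvals >= 0) ;
    // count the tuples in each row, then take the cumulative sum
    for (int64_t i = 0 ; i <= nrows ; i++) Ap [i] = 0 ;
    for (int64_t k = 0 ; k < nvals ; k++)
    {
        assert (I [k] >= 0 && I [k] < nrows) ;
        Ap [I [k] + 1]++ ;
    }
    for (int64_t i = 0 ; i < nrows ; i++) Ap [i+1] += Ap [i] ;
    // place each tuple in its row, keeping the row sorted by column index
    int64_t *next = malloc ((size_t) (nrows > 0 ? nrows : 1) * sizeof (int64_t)) ;
    if (next == NULL) return (-1) ;
    for (int64_t i = 0 ; i < nrows ; i++) next [i] = Ap [i] ;
    for (int64_t k = 0 ; k < nvals ; k++)
    {
        int64_t i = I [k] ;
        int64_t p = next [i]++ ;
        while (p > Ap [i] && Aj [p-1] > J [k])      // stable insertion
        {
            Aj [p] = Aj [p-1] ; Ax [p] = Ax [p-1] ; p-- ;
        }
        Aj [p] = J [k] ; Ax [p] = X [k] ;
    }
    free (next) ;
    // compact in place: adjacent equal column indices are duplicates
    int64_t pdest = 0 ;
    for (int64_t i = 0 ; i < nrows ; i++)
    {
        int64_t pstart = Ap [i], pend = Ap [i+1] ;
        Ap [i] = pdest ;
        for (int64_t p = pstart ; p < pend ; p++)
        {
            if (pdest > Ap [i] && Aj [pdest-1] == Aj [p])
            {
                Ax [pdest-1] = dup (Ax [pdest-1], Ax [p]) ;
            }
            else
            {
                Aj [pdest] = Aj [p] ; Ax [pdest] = Ax [p] ; pdest++ ;
            }
        }
    }
    Ap [nrows] = pdest ;
    return (pdest) ;
}
```

        The work is more than O(1) because the tuples must be sorted and
        their duplicates combined, which matches the remark above.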

        I think this option would be important for the SparseBLAS, to allow for
        fast load/unload of COO matrices into/from a GrB_Matrix or GrB_Vector.
        They anticipate supporting a COO matrix format, with the option of
        creating a corresponding matrix_handle for the data (for GraphBLAS,
        the matrix_handle would be the GrB_Matrix).

        Alternatively, I could create new GrB_build variants (or variant
        behavior via the descriptor) that "ingest" the input GrB_Vectors I, J,
        and X, and I could create a GrB_extractTuples variant that moves its
        data into GrB_Vectors I, J, and X, returning the GrB_Matrix as empty
        with no content.  This could be done via the descriptor, using the new
        GxB_*_build*_Vector methods described above.

        Lots to work out for this, and it's unrelated to the 32/64 bit issue
        (except that the introduction of GrB_Vectors I,J,X to GrB_build and
        extractTuples makes this feature more feasible).  Thus, I don't expect
        to add it to GraphBLAS v10.0.0.