File: CHANGELOG.txt

# 0.0.111 (01.10.2021)
- require minimum version of NCLS

# 0.0.110 (20.09.21)
- fix count_overlaps with keep_nonoverlapping=False
- fix subtract with more than 1024 intervals (new fix)

# 0.0.109 (16.09.21)
- fix overlap invert behavior
- add intersect invert flag
- fix subtract in cases where more than 1024 intervals overlapped a single interval
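
The subtract fixes above concern cases where many intervals overlap a single interval. As a rough illustration of what interval subtraction computes in that situation, here is a pure-Python sketch. It is not the pyranges implementation; `subtract_intervals` is a hypothetical helper, and half-open (start, end) tuples are an assumption of the sketch.

```python
def subtract_intervals(interval, others):
    """Return the parts of `interval` that no interval in `others` covers.

    Hypothetical helper for illustration only. Intervals are assumed to be
    half-open (start, end) tuples of integers.
    """
    start, end = interval
    pieces = []
    cursor = start  # leftmost position not yet known to be covered
    for o_start, o_end in sorted(others):
        if o_end <= cursor:
            continue  # lies entirely left of the remaining region
        if o_start >= end:
            break  # sorted by start, so nothing later can overlap
        if o_start > cursor:
            pieces.append((cursor, o_start))  # uncovered gap before this interval
        cursor = max(cursor, o_end)
        if cursor >= end:
            break
    if cursor < end:
        pieces.append((cursor, end))  # uncovered tail
    return pieces


# One interval overlapped by 2000 others, i.e. well past the 1024 mark.
many = [(i, i + 1) for i in range(0, 4000, 2)]
remaining = subtract_intervals((0, 4000), many)
```

The sweep keeps a single cursor, so the number of overlapping intervals does not affect correctness, only running time.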

# 0.0.106/107/108 (hotfixes) (07/08.09.21)
- fix join with slack mutating first arg
- add flag use_other_strand in join, nearest, k_nearest
- fix categorical-bug in newer versions of pandas
- add function pr.version_info() to print relevant version flags for debugging

# 0.0.105 (23.08.21)
- require bamread 0.0.10 to fix #211

# 0.0.104 (06/20.08.21)
- fix broken three_end/five_end code

# 0.0.102/103 (06.08.21)
- fix bug in pr.count_overlaps
- demand version 0.0.9 or greater from bamread

# 0.0.100/0.0.101 (20/21.06.21)
- add full-flag to read_gtf
- fix bug in join with slack > 0 when result is empty

# 0.0.99 (17.06.21)
- add nb_cpu arg to overlap

# 0.0.98 (07.06.21)
- fix k-nearest how=None

# 0.0.98 (20.05.21)
- fix casting in tss/tes

# 0.0.96/97 (07.05.21)
- fixes to .tes and .tss methods (issue #182)

# 0.0.95 (02.03.21)
- teensy fix bedclip
- add pretty-printing in jupyter notebooks (thanks to @rasi)

# 0.0.94 (27.02.21)
- print warning if start and end columns have different dtypes

# 0.0.93 (25.02.21)
- add max_disjoint for maximal disjoint set
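
A maximal set of non-overlapping intervals can be found with the classic greedy "earliest end first" selection. The sketch below shows that technique in plain Python. Whether pyranges uses this exact approach is not stated here; `max_disjoint_sketch` is a hypothetical name, and half-open (start, end) tuples are assumed.

```python
def max_disjoint_sketch(intervals):
    """Greedy selection of a largest set of mutually non-overlapping intervals.

    Hypothetical helper for illustration only. Intervals are assumed to be
    half-open (start, end) tuples, so (1, 4) and (4, 6) do not overlap.
    """
    chosen = []
    last_end = None
    # Visiting intervals by increasing end leaves the most room for later picks.
    for start, end in sorted(intervals, key=lambda iv: iv[1]):
        if last_end is None or start >= last_end:
            chosen.append((start, end))
            last_end = end
    return chosen


example = [(1, 4), (3, 5), (0, 6), (5, 7), (3, 9), (5, 9),
           (6, 10), (8, 11), (8, 12), (2, 14), (12, 16)]
picked = max_disjoint_sketch(example)
```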

# 0.0.91-92 (15.01.21)
- hotfix for 0.0.90

# 0.0.90 (03.01.21)
- fix #165 slow set operations on small files with many chromosomes (thanks ndukler)

# 0.0.89 (16.11.20)
- fix #159 (thanks cfriedline)

# 0.0.88 (09.11.20)
- fix bug when concatenating stranded and unstranded pyranges (thanks cfriedline, issue #160)

# 0.0.87 (23.10.20)
- fix bug in join with left/right option

# 0.0.86 (05.10.20)
- add slack-option to merge
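
To illustrate what a slack option means for merging, here is a minimal pure-Python sketch. It assumes half-open (start, end) tuples and assumes that two intervals are merged when the gap between them is at most `slack`; the exact boundary behavior in pyranges may differ, and `merge_with_slack` is a hypothetical name.

```python
def merge_with_slack(intervals, slack=0):
    """Merge intervals whose gap is at most `slack`.

    Hypothetical helper for illustration only. Assumes half-open
    (start, end) tuples; with slack=0, bookended intervals are merged.
    """
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + slack:
            # Close enough to the previous interval: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(iv) for iv in merged]


no_slack = merge_with_slack([(1, 3), (5, 8), (2, 4)])
with_slack = merge_with_slack([(1, 3), (5, 8), (2, 4)], slack=1)
```

With `slack=0` the sketch keeps (5, 8) separate from (1, 4); with `slack=1` the one-base gap is bridged and a single interval remains.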

# 0.0.85 (17.09.20)
- fix error when parsing gtf-files with whitespace in value-tags

# 0.0.84 (18.08.20)
- add option to report overlap in join

# 0.0.83 (18.08.20)
- hotfix

# 0.0.82 (18.08.20)
- fix error introduced in 0.0.80

# 0.0.81 (13.08.20)
- fix Fisher's implementation

# 0.0.80 (10.08.20)
- fix reassigning chromosomes in apply

# 0.0.79 (08.06.20)
- fix bug in features.introns where the gene_id column was overwritten (issue #134)

# 0.0.78 (18.03.20)
- add reader for bigwig (pr.read_bigwig)
- fix cluster (allow for multiple by arguments)
- optimize to_bigwig slightly
- fix: overlap did not recognize invert-argument

# 0.0.77 (24.03.20)
- add api-docs
- make default strandedness of apply-pair equal None
- add pr.from_string() to create a PyRanges from a multiline string
- remove set_columns, set on .columns directly
- apply numpy-methods to pyranges
- add pr.get_fasta(gr, path)

# 0.0.76 (20.02.20)
- fix leftover print in itergrs

# 0.0.75 (20.02.20)
- reset index when reading pyranges from df
- ignore reinit error in ray
- did not use copy_df in init

# 0.0.74 (12.02.20)
- support for multiple (repeating) attributes in gtf reading
- fix handling of kwargs in apply, apply_pair, apply_chunks
- add to_example(nrows=10) to get a copy-paste friendly representation of a PyRanges
- add pr.from_dict() to create a PyRanges from a dict (like the ones produced with to_example)

# 0.0.73 (03.02.20)
- fix small bug in jaccard
- remove leftover debug-print in pr.random()
- add experimental gr.stats.forbes
- fix handling of kwargs in apply, apply_pair, apply_chunks

# 0.0.72 (03.02.20)
- random also takes dict as chromsizes argument (like {"chr1": 249, "chr2": 242})
- fix reldist bug when grs have different chromosomes

# 0.0.71 (30.01.20)
- fix various issues with reading and writing gtf/gff3 (1-indexing, removed final ";" in gff3 attribute col when writing)
- remove use of ModuleNotFoundError in __init__.py (it only exists in Python > 3.5)
- gr.overlap(gr2) now has default argument how="first", i.e. only return overlapping intervals once, even though there are multiple overlapping features in gr
- fix bug in pr.stats.mcc when using stranded data

# 0.0.70 (24.01.20)
- add Simes method to pr.stats
- add keys argument to pr.iter
- make strand=None default arg for concat
- gr.split() does opposite of merge
- pr.count_overlaps(grs, features=None) like bedtools multiintersect added
- set mkl.set_num_threads to 1 in __init__
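
For readers unfamiliar with the Simes method mentioned above: it combines n p-values into one by sorting them and taking the minimum of n * p_(i) / i over the ranks i. The sketch below shows only that formula in plain Python. It does not reproduce the pr.stats interface; `simes_sketch` is a hypothetical name.

```python
def simes_sketch(pvalues):
    """Combine p-values with the Simes method: min over i of n * p_(i) / i.

    Hypothetical helper for illustration only; it takes a plain list of
    floats rather than the grouped data a library function would handle.
    """
    n = len(pvalues)
    if n == 0:
        raise ValueError("need at least one p-value")
    ordered = sorted(pvalues)
    # rank is the 1-based position of each p-value in sorted order
    return min(n * p / rank for rank, p in enumerate(ordered, start=1))


combined = simes_sketch([0.01, 0.04, 0.03])
```

For the three values shown, the candidates are 3 * 0.01 / 1, 3 * 0.03 / 2 and 3 * 0.04 / 3, so the combined value is driven by the smallest p-value.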

# 0.0.69 (22.01.20)
- add value col argument to to_bigwig (thanks https://github.com/liyao001)

# 0.0.68 (21.01.20)
- fix regression: slack changes dtype from int32 to int64

# 0.0.67 (10.01.20)
- add dtypes attribute to pyranges
- fix left and right join when chromosomes missing

# 0.0.66 (03.01.20)
- add argument sparse to read_bam. Setting it to False fetches more columns.

# 0.0.65 (10.12.19)
- fix column names after read_gtf so they work with GenomicFeatures
- add chain flag (False by default) to to_* methods
- genomicfeatures: add tss/tes-methods
- remove Strand column with unstrand() even if PyRanges is not stranded
- reading gff and gtf now consistent and column names from attributes are in lower_snake_case

# 0.0.64 (28.11.19)
- add missing example data (ending with gz) to pyranges
- add rowbased_spearman, rowbased_pearson and rowbased_rankdata to pyranges.stats
- pyranges now accept columns with integer names, like pandas

# 0.0.63 (14.11.19)
- ignore index when inserting Series
- able to add dictionary of dfs to a pyranges
- remove FDR from fisher_exact, but add fdr as own method in stats
- make stats.mcc faster
- make stats.mcc work without a genome
- fix gff3 reading when metadata contains spaces

# 0.0.62 (11.11.19)
- fix fisher exact when given pd.Series
- fisher_exact: only use pseudocounts for OR

# 0.0.61 (11.11.19)
- add outer, inner and left join
- add fisher exact
- insert series/dataframes to pyranges with + operator
- gr.Whatever = pd.Series(...) now ignores index
- add gr.copy() method to create deep copy

# 0.0.60 (10.11.19)
- add k-nearest
- ensure that start/end have the same dtype after calling slack
- breaking change: new_position takes no default arg
- new_position takes an argument swap
- .length returns a python integer
- breaking change: lengths returns a vector by default

# 0.0.59 (28.10.19)
- fix AttributeErrors on pyranges (thanks https://github.com/MuhammedHasan)
- add reader for gff3
- add writer for gff3
- add count flag to cluster/cluster_by

# 0.0.58 (25.10.19)
- fix merge print functions
- make pickleable
- add iter as alias for itergrs in pr. namespace
- gr.length() shows nucleotide length (sum of all interval lengths)
- gr.lengths() takes as_dict=False flag to return as vector
- fix slack in join: added columns when joining with itself
- fix print for unstranded pyranges: printed tail and head of first chromosome

# 0.0.57 (10.10.19)
- add overlap-flag to tile
- add chain to print-method
- bugfix: printing stranded pyranges sorted output even though sort was false
- bugfix: wrong number of hidden cols on very small terminal widths
- bugfix: unstrand did not change underlying dict to chromosomes only
- show number of hidden columns in header
- tests: mismatches in strand between dict and dataframes
- .df/.as_df() now returns with non-duplicated index

# 0.0.56 (25.09.19)
- add possibility of 5-end and 3-end in slack being different/none
- add slack to join-method
- add new_position method to take union or intersection of two pairs of Start/End-columns in pyranges
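
The union or intersection of two pairs of Start/End values can be computed per row as below. This is only a sketch of the arithmetic; the function name and signature are hypothetical and do not mirror the pyranges method.

```python
def new_position_sketch(start1, end1, start2, end2, how):
    """Combine two (start, end) pairs into one.

    Hypothetical helper for illustration only.
    how="union" spans both intervals; how="intersection" keeps the shared part.
    """
    if how == "union":
        return min(start1, start2), max(end1, end2)
    if how == "intersection":
        # If the intervals do not overlap, the result has start >= end.
        return max(start1, start2), min(end1, end2)
    raise ValueError("how must be 'union' or 'intersection'")


union = new_position_sketch(10, 20, 15, 30, "union")
intersection = new_position_sketch(10, 20, 15, 30, "intersection")
```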

# 0.0.55 (13.09.19)
- Add int64-flag to method pr.random.

# 0.0.54 (10.09.19)
- Ensure that Chromosome and Strand are str dtype before creating category
- Add check to ensure that the columns Chromosome, Start and End exist when trying to create a PyRanges

# 0.0.53 (02.09.19)
- Fix error in pypi file

# 0.0.52 (30.08.19)
Fixes:
  - fix creating duplicate indexes in pyrange apply both
  - fix regression where joining unstranded and stranded pyrange did not make a stranded pyrange
  - default was strand=False for a few methods, should have been None (i.e. autodetect)
  - read_bed now handles gzipped bed (if the file has the .gz extension)
  - now able to print untraditional strands which are not strings
  - fix drop when "Strand" is part of what is to be dropped
  - more robust checking if column is in gr

Additions:
  - print functions take formatting-argument {"Start": "{:,}"}

Changes:
  - print shows sorted stranded data in Start/End order
  - print dynamically selects number of untraditional strands and hidden columns to display
  - read_bed now takes nrows arg
  - an assertion is now raised if trying to drop "Chromosome", "Start" or "End" (instead of ignoring)
  - to_bed, to_gtf, to_csv now take compression argument ("infer" by default)
  - to_csv writes the header as default
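
The formatting argument listed under Additions, {"Start": "{:,}"}, looks like a mapping from column name to a standard Python format string. Assuming that reading is right, the sketch below shows how such a mapping could be applied to one row; `format_row` is a hypothetical helper, not part of pyranges.

```python
def format_row(row, formatting):
    """Render each value of `row` as text, using per-column format strings.

    Hypothetical helper for illustration only. Columns without an entry
    in `formatting` fall back to str().
    """
    return {
        col: formatting[col].format(value) if col in formatting else str(value)
        for col, value in row.items()
    }


row = {"Chromosome": "chr1", "Start": 1234567, "End": 1234667}
rendered = format_row(row, {"Start": "{:,}"})
```

Here "{:,}" inserts a thousands separator, so only the Start value is rendered with commas.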

# 0.0.51 (01.08.19)
Additions:
  - pr.itergrs added to iterate over the dfs from multiple pyranges at the same time

Changes:
  - pybigwig and bamread are optional dependencies that need to be manually installed (like ray)


# 0.0.50 (29.07.19)
Additions:
  - pr.random(n=1000, length=100, chromsizes=None, strand=True) creates a random PyRanges from a PyRanges of chromosome sizes.

Changes:
  - make __iter__ return natsorted items

Removals:
  - insert. use join instead

Fixes:
  - bug in boolean indexing due to __iter__ returning wrong sort order

# 0.0.49 (26.07.19)
Hotfix:
  - bug in assign (strand=False, by default, not None)

# 0.0.48 (25.07.19)
Additions:
  - head(n=8)
  - tail(n=8)
  - sample(n=8)
  - set_columns(new_names) to set new column names
  - argument like to drop, which takes string describing regex (gr.drop(like="_left|_right"))
  - add count (number of intervals) to merge and merge_by

Fixes:
  - 5X faster boolean indexing
  - fix some bugs in features.introns when data was missing

Changes:
  - coverage renamed to to_rle
  - if drop used without argument, not dropping Strand by default
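
The like argument to drop, as in gr.drop(like="_left|_right"), takes a regex describing the columns to remove. The sketch below shows the idea on a plain list of column names, assuming the pattern may match anywhere in the name; `drop_like` is a hypothetical helper.

```python
import re


def drop_like(columns, like):
    """Return the column names that do NOT match the regex `like`.

    Hypothetical helper for illustration only. re.search is used, so the
    pattern may match anywhere in a column name (an assumption).
    """
    pattern = re.compile(like)
    return [col for col in columns if not pattern.search(col)]


columns = ["Chromosome", "Start", "End", "Score_left", "Score_right", "Name"]
kept = drop_like(columns, "_left|_right")
```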

# 0.0.47 (19.07.2019)
hotfixes

# 0.0.46 (19.07.2019)
Additions:
  - cluster and merge take argument by to only cluster or merge within specific features
  - gr.features.introns added. Can use by="gene" or by="transcript"
  - new data: pr.data.gencode_gtf and pr.data.ucsc_bed
  - can subset pyrange with boolean vector
  - sort also takes argument by (sort without arg sorts on start/end)

# 0.0.45 (14.06.2019)
Fixes:
  - bug in subset which removed strand
  - bug when setting Strand with setattr
  - bug when setting Chromosome with setattr

Changes:
  - new method to compute cluster (3x as fast)
  - string-arg to drop not interpreted as regex
  - drop or keep do not take drop_strand. Only unstrand can drop strand.

Additions:
  - subsetting with new col order rearranges columns

# 0.0.44 (04.06.19)
Changes:
  - Now possible to reset Strand/Chromosome

Additions:
  - gr.drop_duplicate_positions(strand=None) # None means auto => true if stranded otherwise False
  - add test data pr.data.chromsizes()
  - pr.gf.tile_genome(genome_pyrange, tile_size, tile_last=False) (like GenomicRanges tileGenome)
  - pr.gf.genome_bounds(pyrange, genome_pyrange, clip=False) (like UCSC bedclip)
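
The bedclip-like behavior of genome_bounds can be pictured as follows. The sketch assumes that clip=False removes intervals extending past the chromosome end, while clip=True truncates them to the chromosome size; the function name, the tuple layout and the chromsizes dict are hypothetical stand-ins for the PyRanges arguments.

```python
def genome_bounds_sketch(intervals, chromsizes, clip=False):
    """Keep intervals inside chromosome bounds.

    Hypothetical helper for illustration only.
    intervals:  list of (chromosome, start, end) tuples, half-open.
    chromsizes: dict mapping chromosome name to its length.
    clip=False drops out-of-bounds intervals; clip=True truncates them
    (assumed semantics).
    """
    kept = []
    for chrom, start, end in intervals:
        size = chromsizes[chrom]
        if end <= size:
            kept.append((chrom, start, end))  # fully inside
        elif clip and start < size:
            kept.append((chrom, start, size))  # truncate at chromosome end
        # otherwise the interval is out of bounds and is dropped
    return kept


ivs = [("chr1", 0, 50), ("chr1", 90, 120), ("chr1", 150, 200)]
removed = genome_bounds_sketch(ivs, {"chr1": 100})
clipped = genome_bounds_sketch(ivs, {"chr1": 100}, clip=True)
```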

# 0.0.43 (29.05.19)
Fixes:
  - fix bug in tostring
  - fix bug in multithreading

Additions:
  - add apply_chunks, which operates on chunks, instead of chromosome-dfs.

Changes:
  - add nb_cpu argument to all functions
  - add number of columns and stranded/unstranded to tostring
  - add ... as last column, when there are more columns than possible to show
  - use , as thousands separator in tostring for number rows/cols


# 0.0.42 (16.05.19)
Additions:
  - allow keyword-arguments to apply, apply_pair (see example in the docs)

Changes:
  - to_csv etc. returns the objects themselves, so they can be used in method chains
  - methods called tile/window instead of tiles/windows


Fixes:
  - fix print when len(pr) < entries to print
  - tile



# 0.0.41 (14.05.19)
Additions:
  - add slack-flag to cluster/merge
  - print joined positions possible
  - add simple methods for printing without breaking the chain (p, mp, sp, tmp, rp)

Removals:
  - settings in pyranges. Added print methods instead.

Improvement:
  - print methods faster, especially for pyranges with many cols


# 0.0.40 (13.05.19)
Additions:
  - pyranges_db now out on PyPI

Changes:
  - PyRanges can now have Strand column with other data than "+" or "-", but it is considered unstranded.
  - Ensure that slack parameter is always integer.
  - no keep_metadata-flag in windows. Metadata is always kept. Can call drop() beforehand if metadata should not be kept.

Remove:
  - remove confusing keep flag from drop method (use gr[cols_to_keep] instead)

Fixes:
  - add missing ... in pyranges tostring

# 0.0.39 (09.05.19)
Removal:
  - remove sandbox module

# 0.0.37-38 (09.05.19)
Changes:
  - pyranges constructor is copy-by-default

# 0.0.36 (09.05.19)
Additions:
  - add insert method which uses overlap

Changes:
  - read_bed does not fail when strand is "."
  - read_bed considers bed unstranded if Strand has other values than +/-


# 0.0.35 (26.04.19)
Changes:
  - tssify/tesify renamed five_end/three_end
  - five_end/three_end fail when data does not contain strand

Fixes:
  - slack changed pyrange in-place


# 0.0.34 (25.04.19)
Fixes:
  - assign changed pyrange in-place


# 0.0.33 (25.04.19)
Changes:
  - minor bugfix


# 0.0.32 (25.04.19)
Changes:
  - Use gr.to_bed for output_methods, not gr.out.bed
  - Remove copy_df flag in constructor; using df.copy() is terser
  - change flag extended in constructor to int64 (default False)


# 0.0.31 (24.04.19)
Changes:
  - Make int32 default for Start/End

Additions:
  - PyRanges now has window-function, like bedtools makewindows

Fixes:
  - getitem sometimes returned int32-pyrange despite being given int64-pyrange
  - doing nearest two times in a row sometimes failed due to minor suffix-bug
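
The window function listed under Additions is described as similar to bedtools makewindows. As an illustration of fixed-size windowing, the sketch below splits one half-open interval into consecutive windows, truncating the last one at the interval end. `window_sketch` is a hypothetical helper, not the pyranges method.

```python
def window_sketch(start, end, window_size):
    """Split the half-open interval [start, end) into fixed-size windows.

    Hypothetical helper for illustration only. The final window is
    shortened if the interval length is not a multiple of window_size.
    """
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    return [(s, min(s + window_size, end)) for s in range(start, end, window_size)]


windows = window_sketch(0, 25, 10)
```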


# 0.0.30 (23.04.19)
Changes:
  - Make col first argument of assign


# 0.0.29 (23.04.19)
Changes:
  - Move pyranges db to own module to remove mysql-requirement (made wheelmaking hard)

Additions:
  - add assign and subset methods on pyrange



# 0.0.28 (22.04.19)
- Only refer to and use ray in dispatcher

# 0.0.27 (22.04.19)
Fixes:
  - raise Exception when encountering non-"+-" Strand values


# 0.0.26 (15.04.19)

Additions:
  - pr.sandbox.Debug context manager for pipes

Fixes:
  - coverage errored with value_col

# 0.0.25 (15.04.19)
Additions:
  - Can set columns on a PyRanges using a dict of iterables
  - gr() takes subset and col argument, like dplyr mutate and select

Removed:
  - disallow eval string, must use lambdas, e.g.: gr(lambda df: df.Score > 0)

Fixes:
  - drop (and getitem) small fix
  - sometimes had empty dfs in dict because of unused categoricals



# 0.0.24 (15.04.19)
Hotfix:
  - left in dbg statements

# 0.0.23 (15.04.19)
Hotfix:
  - unstrand() did not always remove strand info

# 0.0.22 (14.04.19)
Additions:
  - pr.PyRanges() returns empty PyRange # before you needed pr.PyRanges({})
  - pyranges are now callable. Examples: gr("df.Score > 0") and gr("df.A.astype(str) + mysuffix")
  - can subset PyRanges with a dict of boolean vectors
  - pr.data.exons(), pr.data.cpg()
  - gr.unstrand() removes strand information from a PyRanges
  - throw exception if trying to drop Strand from df without setting drop_strand=True
  - adding a Strand column to the PyRanges makes it stranded

Changes:
  - write dtype as category, not int8/int16/...

Fixes:
  - remove empty dfs in the dict given to the PyRanges constructor

Removed:
  - gr.data.epigenome_roadmap()


# 0.0.21 (14.04.19)
Additions:
  - gr.cluster(): assign ID to each cluster found by merge
  - gr.columns: return the columns in the pyranges
  - gr.drop: drop columns based on regex or list
  - gr[["Score", "Name"]]: select subset of columns
Fixes:
  - gr.stranded errored if chromosomes were ints